From patrick.sanan at gmail.com Fri Nov 1 05:41:09 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 1 Nov 2019 11:41:09 +0100 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? Message-ID: *Context:* I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup [lo-a2-058:21425] *** reported by process [4222287873,2] [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, [lo-a2-058:21425] *** and potentially your MPI job) *Question: *I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. 
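(For reference before the output below: the outer loop added to those examples has roughly the shape sketched here in C. This is a simplified sketch, not the actual attached ex223.c/ex221f.F90; the loop count, the 1-D matrix setup, and all names are illustrative.)

/* Sketch only: a build-solve-destroy loop in the style of the modified example.
 * Every object is created on PETSC_COMM_WORLD and destroyed before the next
 * iteration, so nothing holds a reference to the duplicated inner communicator
 * between iterations. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscInt       it, i, n = 10, col[3];
  PetscScalar    value[3];

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  for (it = 0; it < 5; ++it) {                      /* the added outer loop */
    Mat A; Vec x, b; KSP ksp;

    ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
    ierr = VecSetSizes(x, PETSC_DECIDE, n);CHKERRQ(ierr);
    ierr = VecSetFromOptions(x);CHKERRQ(ierr);
    ierr = VecDuplicate(x, &b);CHKERRQ(ierr);
    ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
    ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
    ierr = MatSetFromOptions(A);CHKERRQ(ierr);
    ierr = MatSetUp(A);CHKERRQ(ierr);
    value[0] = -1.0; value[1] = 2.0; value[2] = -1.0;
    for (i = 1; i < n - 1; i++) {                   /* interior rows of a 1-D Laplacian */
      col[0] = i - 1; col[1] = i; col[2] = i + 1;
      ierr = MatSetValues(A, 1, &i, 3, col, value, INSERT_VALUES);CHKERRQ(ierr);
    }
    i = 0;     ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr); /* boundary rows, simplified */
    i = n - 1; ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = VecSet(b, 1.0);CHKERRQ(ierr);

    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

    /* Destroying everything lets the reference count of the duplicated
     * communicator drop to zero, so the next iteration may duplicate again. */
    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = VecDestroy(&b);CHKERRQ(ierr);
  }
  ierr = PetscFinalize();
  return ierr;
}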
Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 
-2080374784
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
[0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex221f.F90
Type: application/octet-stream
Size: 10705 bytes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex223.c
Type: application/octet-stream
Size: 7641 bytes

From stefano.zampini at gmail.com Fri Nov 1 06:16:59 2019
From: stefano.zampini at gmail.com (Stefano Zampini)
Date: Fri, 1 Nov 2019 14:16:59 +0300
Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran?
In-Reply-To: 
References: 
Message-ID: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com>

From src/sys/objects/ftn-custom/zstart.c, petscinitialize_internal does

    PETSC_COMM_WORLD = MPI_COMM_WORLD

which means that PETSC_COMM_WORLD is not a PETSc communicator.

The first matrix creation duplicates PETSC_COMM_WORLD, and the duplicated communicator can then be reused for the other objects. When you finally destroy the matrix inside the loop, the reference count of this duplicated comm goes to zero and it is freed. This is why you duplicate at each step.

However, the C version of PetscInitialize does the same, so I'm not sure why this happens with Fortran and not with C. (Do you leak objects in the C code?)

> On Nov 1, 2019, at 1:41 PM, Patrick Sanan via petsc-users wrote:
> 
> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat, Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0):
> 
> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup
> [lo-a2-058:21425] *** reported by process [4222287873,2]
> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533
> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error
> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [lo-a2-058:21425] *** and potentially your MPI job)
> 
> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding, and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here.
> 
> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples.
With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] 
PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Fri Nov 1 06:36:03 2019 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Fri, 1 Nov 2019 14:36:03 +0300 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com> References: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com> Message-ID: <4D16D3C4-F23A-425D-B776-C21BE1B4742A@gmail.com> I know why your C code does not duplicate the comm at each step. This is because it uses PETSC_VIEWER_STDOUT_WORLD, which basically inserts the duplicated comm into PETSC_COMM_WORLD as attribute. Try removing the KSPView call and you will see the C code behaves as the Fortran one. > On Nov 1, 2019, at 2:16 PM, Stefano Zampini wrote: > > From src/sys/objects/ftn-custom/zstart.c petscinitialize_internal > > PETSC_COMM_WORLD = MPI_COMM_WORLD > > Which means that PETSC_COMM_WORLD is not a PETSc communicator. > > The first matrix creation duplicates the PETSC_COMM_WORLD and thus can be reused for the other objects > When you finally destroy the matrix inside the loop, the ref count of this duplicated comm goes to zero and it is free > This is why you duplicate at each step > > However, the C version of PetscInitialize does the same, so I?m not sure why this happens with Fortran and not with C. (Do you leak objects in the C code?) > > >> On Nov 1, 2019, at 1:41 PM, Patrick Sanan via petsc-users > wrote: >> >> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): >> >> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup >> [lo-a2-058:21425] *** reported by process [4222287873,2] >> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 >> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error >> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, >> [lo-a2-058:21425] *** and potentially your MPI job) >> >> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. 
Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. >> >> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: >> >> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> >> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Nov 1 06:45:04 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 1 Nov 2019 12:45:04 +0100 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: <4D16D3C4-F23A-425D-B776-C21BE1B4742A@gmail.com> References: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com> <4D16D3C4-F23A-425D-B776-C21BE1B4742A@gmail.com> Message-ID: Ah, really interesting! 
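(Context for the test Patrick describes next: if any PETSc object created on PETSC_COMM_WORLD stays alive across the loop, the duplicated inner communicator's reference count never drops to zero, so later creations reuse it instead of calling MPI_Comm_dup() again. A minimal sketch of that pattern follows, in C rather than the attached Fortran example; the name "keeper" is illustrative.)

/* Sketch: hold one long-lived object so the duplicated communicator survives
 * between iterations. Not the attached ex321f.F90; names are illustrative. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  KSP            keeper;                 /* never used to solve anything */
  PetscInt       it;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = KSPCreate(PETSC_COMM_WORLD, &keeper);CHKERRQ(ierr);  /* duplicates the comm once */

  for (it = 0; it < 5; ++it) {
    /* ... create Mat/Vec/KSP, solve, destroy, exactly as before; the inner
     * communicator is found and reused, so no further MPI_Comm_dup() calls ... */
  }

  ierr = KSPDestroy(&keeper);CHKERRQ(ierr);  /* reference released only at the end */
  ierr = PetscFinalize();
  return ierr;
}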
In the attached ex321f.F90, I create a dummy KSP before the loop, and indeed the behavior is as you say - no duplications [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex321f -info | grep PetscCommDuplicate [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 I've asked the user to re-run with -info, so then I'll hopefully be able to see whether the duplication is happening as I expect (in which case your insight might provide at least a workaround), and to see if it's choosing a new communicator number each time, somehow. > Am 01.11.2019 um 12:36 schrieb Stefano Zampini : > > I know why your C code does not duplicate the comm at each step. This is because it uses PETSC_VIEWER_STDOUT_WORLD, which basically inserts the duplicated comm into PETSC_COMM_WORLD as attribute. Try removing the KSPView call and you will see the C code behaves as the Fortran one. > > >> On Nov 1, 2019, at 2:16 PM, Stefano Zampini > wrote: >> >> From src/sys/objects/ftn-custom/zstart.c petscinitialize_internal >> >> PETSC_COMM_WORLD = MPI_COMM_WORLD >> >> Which means that PETSC_COMM_WORLD is not a PETSc communicator. 
>> >> The first matrix creation duplicates the PETSC_COMM_WORLD and thus can be reused for the other objects >> When you finally destroy the matrix inside the loop, the ref count of this duplicated comm goes to zero and it is free >> This is why you duplicate at each step >> >> However, the C version of PetscInitialize does the same, so I?m not sure why this happens with Fortran and not with C. (Do you leak objects in the C code?) >> >> >>> On Nov 1, 2019, at 1:41 PM, Patrick Sanan via petsc-users > wrote: >>> >>> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): >>> >>> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup >>> [lo-a2-058:21425] *** reported by process [4222287873,2] >>> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 >>> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error >>> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, >>> [lo-a2-058:21425] *** and potentially your MPI job) >>> >>> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. >>> >>> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. 
Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: >>> >>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> >>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784
>>>
>>>
>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex321f.F90
Type: application/octet-stream
Size: 10854 bytes

From stefano.zampini at gmail.com Fri Nov 1 06:48:57 2019
From: stefano.zampini at gmail.com (Stefano Zampini)
Date: Fri, 1 Nov 2019 14:48:57 +0300
Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran?
In-Reply-To: 
References: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com> <4D16D3C4-F23A-425D-B776-C21BE1B4742A@gmail.com>
Message-ID: 

It seems we don't have a Fortran wrapper for PetscCommDuplicate (or at least I cannot find it). Is this an oversight?

If we have a Fortran wrapper for PetscComm{Duplicate,Destroy}, the proper fix will be to call PetscCommDuplicate(PETSC_COMM_WORLD,&user_petsc_comm) after PetscInitialize and PetscCommDestroy(&user_petsc_comm) right before PetscFinalize is called in your app.

> On Nov 1, 2019, at 2:45 PM, Patrick Sanan wrote:
> 
> Ah, really interesting!
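(To illustrate the suggestion above: in C, where PetscCommDuplicate()/PetscCommDestroy() can be called directly, the proposed bracketing looks roughly like the sketch below. The variable name user_petsc_comm is taken from the message above; as noted there, a Fortran wrapper for these calls does not appear to exist.)

/* Sketch of the proposed fix in C: hold one extra reference to the inner
 * PETSc communicator for the life of the program. */
#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  MPI_Comm       user_petsc_comm;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* Attaches (or finds) the duplicated communicator on PETSC_COMM_WORLD and
   * bumps its reference count. */
  ierr = PetscCommDuplicate(PETSC_COMM_WORLD, &user_petsc_comm, NULL);CHKERRQ(ierr);

  /* ... application code: create/solve/destroy PETSc objects as often as
   * desired; the same inner communicator is reused throughout ... */

  ierr = PetscCommDestroy(&user_petsc_comm);CHKERRQ(ierr);  /* drop the extra reference */
  ierr = PetscFinalize();
  return ierr;
}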
In the attached ex321f.F90, I create a dummy KSP before the loop, and indeed the behavior is as you say - no duplications > > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex321f -info | grep PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > > I've asked the user to re-run with -info, so then I'll hopefully be able to see whether the duplication is happening as I expect (in which case your insight might provide at least a workaround), and to see if it's choosing a new communicator number each time, somehow. > >> Am 01.11.2019 um 12:36 schrieb Stefano Zampini >: >> >> I know why your C code does not duplicate the comm at each step. This is because it uses PETSC_VIEWER_STDOUT_WORLD, which basically inserts the duplicated comm into PETSC_COMM_WORLD as attribute. Try removing the KSPView call and you will see the C code behaves as the Fortran one. >> >> >>> On Nov 1, 2019, at 2:16 PM, Stefano Zampini > wrote: >>> >>> From src/sys/objects/ftn-custom/zstart.c petscinitialize_internal >>> >>> PETSC_COMM_WORLD = MPI_COMM_WORLD >>> >>> Which means that PETSC_COMM_WORLD is not a PETSc communicator. 
>>> >>> The first matrix creation duplicates the PETSC_COMM_WORLD and thus can be reused for the other objects >>> When you finally destroy the matrix inside the loop, the ref count of this duplicated comm goes to zero and it is free >>> This is why you duplicate at each step >>> >>> However, the C version of PetscInitialize does the same, so I?m not sure why this happens with Fortran and not with C. (Do you leak objects in the C code?) >>> >>> >>>> On Nov 1, 2019, at 1:41 PM, Patrick Sanan via petsc-users > wrote: >>>> >>>> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): >>>> >>>> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup >>>> [lo-a2-058:21425] *** reported by process [4222287873,2] >>>> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 >>>> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error >>>> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, >>>> [lo-a2-058:21425] *** and potentially your MPI job) >>>> >>>> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. >>>> >>>> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. 
Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: >>>> >>>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate >>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> >>>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate >>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Duplicating a communicator 
1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>>> >>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Nov 1 10:09:52 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 1 Nov 2019 16:09:52 +0100 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: References: <6CA72589-DD32-4FC8-B7CD-D692401DBE7A@gmail.com> <4D16D3C4-F23A-425D-B776-C21BE1B4742A@gmail.com> Message-ID: I don't see those interfaces, either. If there was a reason that they're non-trivial to implement, we should at least note on the man pages in "Fortran Note:" sections that they don't exist. In this particular instance, we can get by without those interfaces by just creating and destroying the KSP once (the settings are constant), thus hanging onto a reference that way. I'll wait for our -info run to come back and will then confirm that this fixes things. Thanks again, Stefano! Am Fr., 1. Nov. 2019 um 12:49 Uhr schrieb Stefano Zampini < stefano.zampini at gmail.com>: > It seems we don?t have a fortran wrapper for PetscCommDuplicate (or at > least I cannot find it) Is this an oversight? > > If we have a Fortran wrapper for PetscComm{Duplicate~Destroy}, the proper > fix will be to call PetscCommDuplicate(PETSC_COMM_WORLD,&user_petsc_comm) > after PetscInitalize and PetscCommDestroy(&user_petsc_comm) right before > PetscFinalize is called in your app > > On Nov 1, 2019, at 2:45 PM, Patrick Sanan wrote: > > Ah, really interesting! 
In the attached ex321f.F90, I create a dummy KSP > before the loop, and indeed the behavior is as you say - no duplications > > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex321f -info | grep > PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > > I've asked the user to re-run with -info, so then I'll hopefully be able > to see whether the duplication is happening as I expect (in which case your > insight might provide at least a workaround), and to see if it's choosing a > new communicator number each time, somehow. > > Am 01.11.2019 um 12:36 schrieb Stefano Zampini >: > > I know why your C code does not duplicate the comm at each step. This is > because it uses PETSC_VIEWER_STDOUT_WORLD, which basically inserts the > duplicated comm into PETSC_COMM_WORLD as attribute. Try removing the > KSPView call and you will see the C code behaves as the Fortran one. > > > On Nov 1, 2019, at 2:16 PM, Stefano Zampini > wrote: > > From src/sys/objects/ftn-custom/zstart.c petscinitialize_internal > > PETSC_COMM_WORLD = MPI_COMM_WORLD > > Which means that PETSC_COMM_WORLD is not a PETSc communicator. 
> > The first matrix creation duplicates the PETSC_COMM_WORLD and thus can be > reused for the other objects > When you finally destroy the matrix inside the loop, the ref count of this > duplicated comm goes to zero and it is free > This is why you duplicate at each step > > However, the C version of PetscInitialize does the same, so I?m not sure > why this happens with Fortran and not with C. (Do you leak objects in the C > code?) > > > On Nov 1, 2019, at 1:41 PM, Patrick Sanan via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > *Context:* I'm trying to track down an error that (only) arises when > running a Fortran 90 code, using PETSc, on a new cluster. The code creates > and destroys a linear system (Mat,Vec, and KSP) at each of (many) > timesteps. The error message from a user looks like this, which leads me to > suspect that MPI_Comm_dup() is being called many times and this is > eventually a problem for this particular MPI implementation (Open MPI > 2.1.0): > > > [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup > [lo-a2-058:21425] *** reported by process [4222287873,2] > [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 > [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error > [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator > will now abort, > [lo-a2-058:21425] *** and potentially your MPI job) > > *Question: *I remember some discussion recently (but can't find the > thread) about not calling MPI_Comm_dup() too many times from > PetscCommDuplicate(), which would allow one to safely use the (admittedly > not optimal) approach used in this application code. Is that a correct > understanding and would the fixes made in that context also apply to > Fortran? I don't fully understand the details of the MPI techniques used, > so thought I'd ask here. > > If I hack a simple build-solve-destroy example to run several loops, I see > a notable difference between C and Fortran examples. With the attached > ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP > tutorials examples ex23.c and ex21f.F90, respectively, I see the following. 
> Note that in the Fortran case, it appears that communicators are actually > duplicated in each loop, but in the C case, this only happens in the first > loop: > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep > PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep > PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > -2080374784 > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Fri Nov 1 10:14:30 2019 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Fri, 1 Nov 2019 10:14:30 -0500 Subject: [petsc-users] VI: RS vs SS In-Reply-To: <7D085275-4E5C-4152-9784-51A703688A0A@mcs.anl.gov> References: <6DA5E815-6FB8-465F-98A5-3BA67F668AFB@mcs.anl.gov> <7D085275-4E5C-4152-9784-51A703688A0A@mcs.anl.gov> Message-ID: No, the matrix is not symmetric because of how we impose some Dirichlet conditions on the boundary. I could easily give you the Jacobian, for one of the "bad" problems. But at least in the case of RSLS, I don't know whether the algorithm is performing badly, or whether the slow convergence is simply a property of the algorithm. Here's a VI monitor history for a representative "bad" solve. 0 SNES VI Function norm 0.229489 Active lower constraints 0/1 upper constraints 0/1 Percent of total 0. Percent of bounded 0. 1 SNES VI Function norm 0.365268 Active lower constraints 83/85 upper constraints 83/85 Percent of total 0.207241 Percent of bounded 0. 2 SNES VI Function norm 0.495088 Active lower constraints 82/84 upper constraints 82/84 Percent of total 0.204744 Percent of bounded 0. 3 SNES VI Function norm 0.478328 Active lower constraints 81/83 upper constraints 81/83 Percent of total 0.202247 Percent of bounded 0. 4 SNES VI Function norm 0.46163 Active lower constraints 80/82 upper constraints 80/82 Percent of total 0.19975 Percent of bounded 0. 5 SNES VI Function norm 0.444996 Active lower constraints 79/81 upper constraints 79/81 Percent of total 0.197253 Percent of bounded 0. 6 SNES VI Function norm 0.428424 Active lower constraints 78/80 upper constraints 78/80 Percent of total 0.194757 Percent of bounded 0. 7 SNES VI Function norm 0.411916 Active lower constraints 77/79 upper constraints 77/79 Percent of total 0.19226 Percent of bounded 0. 8 SNES VI Function norm 0.395472 Active lower constraints 76/78 upper constraints 76/78 Percent of total 0.189763 Percent of bounded 0. 9 SNES VI Function norm 0.379092 Active lower constraints 75/77 upper constraints 75/77 Percent of total 0.187266 Percent of bounded 0. 10 SNES VI Function norm 0.362776 Active lower constraints 74/76 upper constraints 74/76 Percent of total 0.184769 Percent of bounded 0. 11 SNES VI Function norm 0.346525 Active lower constraints 73/75 upper constraints 73/75 Percent of total 0.182272 Percent of bounded 0. 12 SNES VI Function norm 0.330338 Active lower constraints 72/74 upper constraints 72/74 Percent of total 0.179775 Percent of bounded 0. 
13 SNES VI Function norm 0.314217 Active lower constraints 71/73 upper constraints 71/73 Percent of total 0.177278 Percent of bounded 0. 14 SNES VI Function norm 0.298162 Active lower constraints 70/72 upper constraints 70/72 Percent of total 0.174782 Percent of bounded 0. 15 SNES VI Function norm 0.282173 Active lower constraints 69/71 upper constraints 69/71 Percent of total 0.172285 Percent of bounded 0. 16 SNES VI Function norm 0.26625 Active lower constraints 68/70 upper constraints 68/70 Percent of total 0.169788 Percent of bounded 0. 17 SNES VI Function norm 0.250393 Active lower constraints 67/69 upper constraints 67/69 Percent of total 0.167291 Percent of bounded 0. 18 SNES VI Function norm 0.234604 Active lower constraints 66/68 upper constraints 66/68 Percent of total 0.164794 Percent of bounded 0. 19 SNES VI Function norm 0.218882 Active lower constraints 65/67 upper constraints 65/67 Percent of total 0.162297 Percent of bounded 0. 20 SNES VI Function norm 0.203229 Active lower constraints 64/66 upper constraints 64/66 Percent of total 0.1598 Percent of bounded 0. 21 SNES VI Function norm 0.187643 Active lower constraints 63/65 upper constraints 63/65 Percent of total 0.157303 Percent of bounded 0. 22 SNES VI Function norm 0.172126 Active lower constraints 62/64 upper constraints 62/64 Percent of total 0.154806 Percent of bounded 0. 23 SNES VI Function norm 0.156679 Active lower constraints 61/63 upper constraints 61/63 Percent of total 0.15231 Percent of bounded 0. 24 SNES VI Function norm 0.141301 Active lower constraints 60/62 upper constraints 60/62 Percent of total 0.149813 Percent of bounded 0. 25 SNES VI Function norm 0.125993 Active lower constraints 59/61 upper constraints 59/61 Percent of total 0.147316 Percent of bounded 0. 26 SNES VI Function norm 0.110755 Active lower constraints 58/60 upper constraints 58/60 Percent of total 0.144819 Percent of bounded 0. 27 SNES VI Function norm 0.0955886 Active lower constraints 57/59 upper constraints 57/59 Percent of total 0.142322 Percent of bounded 0. 28 SNES VI Function norm 0.0804936 Active lower constraints 56/58 upper constraints 56/58 Percent of total 0.139825 Percent of bounded 0. 29 SNES VI Function norm 0.0654705 Active lower constraints 55/57 upper constraints 55/57 Percent of total 0.137328 Percent of bounded 0. 30 SNES VI Function norm 0.0505198 Active lower constraints 54/56 upper constraints 54/56 Percent of total 0.134831 Percent of bounded 0. 31 SNES VI Function norm 0.0356422 Active lower constraints 53/55 upper constraints 53/55 Percent of total 0.132335 Percent of bounded 0. 32 SNES VI Function norm 0.020838 Active lower constraints 52/54 upper constraints 52/54 Percent of total 0.129838 Percent of bounded 0. 33 SNES VI Function norm 0.0061078 Active lower constraints 51/53 upper constraints 51/53 Percent of total 0.127341 Percent of bounded 0. 34 SNES VI Function norm 2.2664e-12 Active lower constraints 51/52 upper constraints 51/52 Percent of total 0.127341 Percent of bounded 0. I've read that in some cases the VI solver is simply unable to move the constraint set more than one grid cell per non-linear iteration. That looks like what I'm seeing here... On Tue, Oct 29, 2019 at 7:15 AM Munson, Todd wrote: > > Hi, > > Is the matrix for the linear PDE symmetric? If so, then the VI is > equivalent to > finding the stationary points of a bound-constrained quadratic program and > you > may want to use the TAO Newton Trust-Region or Line-Search methods for > bound-constrained optimization problems. 
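(A rough illustration of the reformulation Todd mentions, for the case where M really is symmetric: the box-constrained VI with affine residual Mx + q and bounds l <= x <= u has the same stationary points as the bound-constrained QP min 0.5 x'Mx + q'x. The sketch below uses the TAO API names current around the time of this thread, roughly PETSc 3.12; some routines have since been renamed. The struct, function, and variable names are illustrative, not from MOOSE or the examples discussed.)

/* Sketch, under the symmetry assumption stated above: solve the QP with TAO's
 * bound-constrained Newton solvers (TAOBNTR trust region, TAOBNLS line search). */
#include <petsctao.h>

typedef struct { Mat M; Vec q; } AppCtx;

static PetscErrorCode FormFunctionGradient(Tao tao, Vec X, PetscReal *f, Vec G, void *ctx)
{
  AppCtx        *user = (AppCtx*)ctx;
  PetscScalar    xMx, qx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatMult(user->M, X, G);CHKERRQ(ierr);    /* G = M x (gradient for symmetric M) */
  ierr = VecDot(X, G, &xMx);CHKERRQ(ierr);
  ierr = VecDot(X, user->q, &qx);CHKERRQ(ierr);
  *f   = 0.5*PetscRealPart(xMx) + PetscRealPart(qx);
  ierr = VecAXPY(G, 1.0, user->q);CHKERRQ(ierr);  /* G = M x + q */
  PetscFunctionReturn(0);
}

static PetscErrorCode FormHessian(Tao tao, Vec X, Mat H, Mat Hpre, void *ctx)
{
  PetscFunctionBeginUser;                         /* Hessian is the constant matrix M */
  PetscFunctionReturn(0);
}

static PetscErrorCode SolveBoundedQP(Mat M, Vec q, Vec xl, Vec xu, Vec x)
{
  AppCtx         user = {M, q};
  Tao            tao;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TaoCreate(PetscObjectComm((PetscObject)M), &tao);CHKERRQ(ierr);
  ierr = TaoSetType(tao, TAOBNTR);CHKERRQ(ierr);              /* or TAOBNLS */
  ierr = TaoSetInitialVector(tao, x);CHKERRQ(ierr);
  ierr = TaoSetVariableBounds(tao, xl, xu);CHKERRQ(ierr);
  ierr = TaoSetObjectiveAndGradientRoutine(tao, FormFunctionGradient, &user);CHKERRQ(ierr);
  ierr = TaoSetHessianRoutine(tao, M, M, FormHessian, &user);CHKERRQ(ierr);
  ierr = TaoSetFromOptions(tao);CHKERRQ(ierr);                /* e.g. -tao_monitor */
  ierr = TaoSolve(tao);CHKERRQ(ierr);
  ierr = TaoDestroy(&tao);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}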
> > Alp: are there flags set when a problem is linear with a symmetric > matrix? Maybe > we can do an internal reformulation in those cases to use the optimization > tools. > > Is there an easy way to get the matrix and the constant vector for one of > the > problems that fails or does not perform well? Typically, the TAO RSLS > methods will work well for the types of problems that you have and if > they are not, then I can go about finding out why and making some > improvements. > > Monotone in this case is that your matrix is positive semidefinite; x^TMx > >= 0 for > all x. For M symmetric, this is the same as M having all nonnegative > eigenvalues. > > Todd. > > > On Oct 28, 2019, at 11:14 PM, Alexander Lindsay < > alexlindsay239 at gmail.com> wrote: > > > > On Thu, Oct 24, 2019 at 4:52 AM Munson, Todd > wrote: > > > > Hi, > > > > For these problems, how large are they? And are they linear or > nonlinear? > > What I can do is use some fancier tools to help with what is going on > with > > the solvers in certain cases. > > > > For the results cited above: > > > > 100 elements -> 101 dofs > > 1,000 elements -> 1,001 dofs > > 10,000 elements -> 10,001 dofs > > > > The PDE is linear with simple bounds constraints on the variable: 0 <= u > <= 10 > > > > > > For Barry's question, the matrix in the SS solver is a diagonal matrix > plus > > a column scaling of the Jacobian. > > > > Note: semismooth, reduced space and interior point methods mainly work > for > > problems that are strictly monotone. > > > > Dumb question, but monotone in what way? > > > > Thanks for the replies! > > > > Alex > > > > Finding out what is going on with > > your problems with some additional diagnostics might yield some > > insights. > > > > Todd. > > > > > On Oct 24, 2019, at 3:36 AM, Smith, Barry F. > wrote: > > > > > > > > > See bottom > > > > > > > > >> On Oct 14, 2019, at 1:12 PM, Justin Chang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > >> > > >> It might depend on your application, but for my stuff on maximum > principles for advection-diffusion, I found RS to be much better than SS. > Here?s the paper I wrote documenting the performance numbers I came across > > >> > > >> https://www.sciencedirect.com/science/article/pii/S0045782516316176 > > >> > > >> Or the arXiV version: > > >> > > >> https://arxiv.org/pdf/1611.08758.pdf > > >> > > >> > > >> On Mon, Oct 14, 2019 at 1:07 PM Alexander Lindsay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > >> I've been working on mechanical contact in MOOSE for a while, and > it's led to me to think about general inequality constraint enforcement. > I've been playing around with both `vinewtonssls` and `vinewtonrsls`. In > Benson's and Munson's Flexible Complementarity Solvers paper, they were > able to solve 73.7% of their problems with SS and 65.5% with RS which led > them to conclude that the SS method is generally more robust. We have had > at least one instance where a MOOSE user reported an order of magnitude > reduction in non-linear iterations when switching from SS to RS. 
Moreover, > when running the problem described in this issue, I get these results: > > >> > > >> num_elements = 100 > > >> SS nl iterations = 53 > > >> RS nl iterations = 22 > > >> > > >> num_elements = 1000 > > >> SS nl iterations = 123 > > >> RS nl iterations = 140 > > >> > > >> num_elements = 10000 > > >> SS: fails to converge within 50 nl iterations during the second time > step whether using a `basic` or `bt` line search > > >> RS: fails to converge within 50 nl iterations during the second time > step whether using a `basic` or `bt` line search (although I believe > `vinewtonrsls` performs a line-search that is guaranteed to keep the > degrees of freedom within their bounds) > > >> > > >> So depending on the number of elements, it appears that either SS or > RS may be more performant. I guess since I can get different relative > performance with even the same PDE, it would be silly for me to ask for > guidance on when to use which? In the conclusion of Benson's and Munson's > paper, they mention using mesh sequencing for generating initial guesses on > finer meshes. Does anyone know whether there have been any publications > using PETSc/TAO and mesh sequencing for solving large VI problems? > > >> > > >> A related question: what needs to be done to allow SS to run with > `-snes_mf_operator`? RS already appears to support the option. > > > > > > This may not make sense. Is the operator used in the SS solution > process derivable from the function that is being optimized with the > constraints or some strange scaled beast? > > >> > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmunson at mcs.anl.gov Fri Nov 1 10:18:59 2019 From: tmunson at mcs.anl.gov (Munson, Todd) Date: Fri, 1 Nov 2019 15:18:59 +0000 Subject: [petsc-users] VI: RS vs SS In-Reply-To: References: <6DA5E815-6FB8-465F-98A5-3BA67F668AFB@mcs.anl.gov> <7D085275-4E5C-4152-9784-51A703688A0A@mcs.anl.gov> Message-ID: Yes, that looks weird. Can you send me directly the linear problem (M, q, l, and u)? I will take a look and run some other diagnostics with some of my other tools. Thanks, Todd. > On Nov 1, 2019, at 10:14 AM, Alexander Lindsay wrote: > > No, the matrix is not symmetric because of how we impose some Dirichlet conditions on the boundary. I could easily give you the Jacobian, for one of the "bad" problems. But at least in the case of RSLS, I don't know whether the algorithm is performing badly, or whether the slow convergence is simply a property of the algorithm. Here's a VI monitor history for a representative "bad" solve. > > 0 SNES VI Function norm 0.229489 Active lower constraints 0/1 upper constraints 0/1 Percent of total 0. Percent of bounded 0. > 1 SNES VI Function norm 0.365268 Active lower constraints 83/85 upper constraints 83/85 Percent of total 0.207241 Percent of bounded 0. > 2 SNES VI Function norm 0.495088 Active lower constraints 82/84 upper constraints 82/84 Percent of total 0.204744 Percent of bounded 0. > 3 SNES VI Function norm 0.478328 Active lower constraints 81/83 upper constraints 81/83 Percent of total 0.202247 Percent of bounded 0. > 4 SNES VI Function norm 0.46163 Active lower constraints 80/82 upper constraints 80/82 Percent of total 0.19975 Percent of bounded 0. > 5 SNES VI Function norm 0.444996 Active lower constraints 79/81 upper constraints 79/81 Percent of total 0.197253 Percent of bounded 0. > 6 SNES VI Function norm 0.428424 Active lower constraints 78/80 upper constraints 78/80 Percent of total 0.194757 Percent of bounded 0. 
> 7 SNES VI Function norm 0.411916 Active lower constraints 77/79 upper constraints 77/79 Percent of total 0.19226 Percent of bounded 0. > 8 SNES VI Function norm 0.395472 Active lower constraints 76/78 upper constraints 76/78 Percent of total 0.189763 Percent of bounded 0. > 9 SNES VI Function norm 0.379092 Active lower constraints 75/77 upper constraints 75/77 Percent of total 0.187266 Percent of bounded 0. > 10 SNES VI Function norm 0.362776 Active lower constraints 74/76 upper constraints 74/76 Percent of total 0.184769 Percent of bounded 0. > 11 SNES VI Function norm 0.346525 Active lower constraints 73/75 upper constraints 73/75 Percent of total 0.182272 Percent of bounded 0. > 12 SNES VI Function norm 0.330338 Active lower constraints 72/74 upper constraints 72/74 Percent of total 0.179775 Percent of bounded 0. > 13 SNES VI Function norm 0.314217 Active lower constraints 71/73 upper constraints 71/73 Percent of total 0.177278 Percent of bounded 0. > 14 SNES VI Function norm 0.298162 Active lower constraints 70/72 upper constraints 70/72 Percent of total 0.174782 Percent of bounded 0. > 15 SNES VI Function norm 0.282173 Active lower constraints 69/71 upper constraints 69/71 Percent of total 0.172285 Percent of bounded 0. > 16 SNES VI Function norm 0.26625 Active lower constraints 68/70 upper constraints 68/70 Percent of total 0.169788 Percent of bounded 0. > 17 SNES VI Function norm 0.250393 Active lower constraints 67/69 upper constraints 67/69 Percent of total 0.167291 Percent of bounded 0. > 18 SNES VI Function norm 0.234604 Active lower constraints 66/68 upper constraints 66/68 Percent of total 0.164794 Percent of bounded 0. > 19 SNES VI Function norm 0.218882 Active lower constraints 65/67 upper constraints 65/67 Percent of total 0.162297 Percent of bounded 0. > 20 SNES VI Function norm 0.203229 Active lower constraints 64/66 upper constraints 64/66 Percent of total 0.1598 Percent of bounded 0. > 21 SNES VI Function norm 0.187643 Active lower constraints 63/65 upper constraints 63/65 Percent of total 0.157303 Percent of bounded 0. > 22 SNES VI Function norm 0.172126 Active lower constraints 62/64 upper constraints 62/64 Percent of total 0.154806 Percent of bounded 0. > 23 SNES VI Function norm 0.156679 Active lower constraints 61/63 upper constraints 61/63 Percent of total 0.15231 Percent of bounded 0. > 24 SNES VI Function norm 0.141301 Active lower constraints 60/62 upper constraints 60/62 Percent of total 0.149813 Percent of bounded 0. > 25 SNES VI Function norm 0.125993 Active lower constraints 59/61 upper constraints 59/61 Percent of total 0.147316 Percent of bounded 0. > 26 SNES VI Function norm 0.110755 Active lower constraints 58/60 upper constraints 58/60 Percent of total 0.144819 Percent of bounded 0. > 27 SNES VI Function norm 0.0955886 Active lower constraints 57/59 upper constraints 57/59 Percent of total 0.142322 Percent of bounded 0. > 28 SNES VI Function norm 0.0804936 Active lower constraints 56/58 upper constraints 56/58 Percent of total 0.139825 Percent of bounded 0. > 29 SNES VI Function norm 0.0654705 Active lower constraints 55/57 upper constraints 55/57 Percent of total 0.137328 Percent of bounded 0. > 30 SNES VI Function norm 0.0505198 Active lower constraints 54/56 upper constraints 54/56 Percent of total 0.134831 Percent of bounded 0. > 31 SNES VI Function norm 0.0356422 Active lower constraints 53/55 upper constraints 53/55 Percent of total 0.132335 Percent of bounded 0. 
> 32 SNES VI Function norm 0.020838 Active lower constraints 52/54 upper constraints 52/54 Percent of total 0.129838 Percent of bounded 0. > 33 SNES VI Function norm 0.0061078 Active lower constraints 51/53 upper constraints 51/53 Percent of total 0.127341 Percent of bounded 0. > 34 SNES VI Function norm 2.2664e-12 Active lower constraints 51/52 upper constraints 51/52 Percent of total 0.127341 Percent of bounded 0. > > I've read that in some cases the VI solver is simply unable to move the constraint set more than one grid cell per non-linear iteration. That looks like what I'm seeing here... > > On Tue, Oct 29, 2019 at 7:15 AM Munson, Todd wrote: > > Hi, > > Is the matrix for the linear PDE symmetric? If so, then the VI is equivalent to > finding the stationary points of a bound-constrained quadratic program and you > may want to use the TAO Newton Trust-Region or Line-Search methods for > bound-constrained optimization problems. > > Alp: are there flags set when a problem is linear with a symmetric matrix? Maybe > we can do an internal reformulation in those cases to use the optimization tools. > > Is there an easy way to get the matrix and the constant vector for one of the > problems that fails or does not perform well? Typically, the TAO RSLS > methods will work well for the types of problems that you have and if > they are not, then I can go about finding out why and making some > improvements. > > Monotone in this case is that your matrix is positive semidefinite; x^TMx >= 0 for > all x. For M symmetric, this is the same as M having all nonnegative eigenvalues. > > Todd. > > > On Oct 28, 2019, at 11:14 PM, Alexander Lindsay wrote: > > > > On Thu, Oct 24, 2019 at 4:52 AM Munson, Todd wrote: > > > > Hi, > > > > For these problems, how large are they? And are they linear or nonlinear? > > What I can do is use some fancier tools to help with what is going on with > > the solvers in certain cases. > > > > For the results cited above: > > > > 100 elements -> 101 dofs > > 1,000 elements -> 1,001 dofs > > 10,000 elements -> 10,001 dofs > > > > The PDE is linear with simple bounds constraints on the variable: 0 <= u <= 10 > > > > > > For Barry's question, the matrix in the SS solver is a diagonal matrix plus > > a column scaling of the Jacobian. > > > > Note: semismooth, reduced space and interior point methods mainly work for > > problems that are strictly monotone. > > > > Dumb question, but monotone in what way? > > > > Thanks for the replies! > > > > Alex > > > > Finding out what is going on with > > your problems with some additional diagnostics might yield some > > insights. > > > > Todd. > > > > > On Oct 24, 2019, at 3:36 AM, Smith, Barry F. wrote: > > > > > > > > > See bottom > > > > > > > > >> On Oct 14, 2019, at 1:12 PM, Justin Chang via petsc-users wrote: > > >> > > >> It might depend on your application, but for my stuff on maximum principles for advection-diffusion, I found RS to be much better than SS. Here?s the paper I wrote documenting the performance numbers I came across > > >> > > >> https://www.sciencedirect.com/science/article/pii/S0045782516316176 > > >> > > >> Or the arXiV version: > > >> > > >> https://arxiv.org/pdf/1611.08758.pdf > > >> > > >> > > >> On Mon, Oct 14, 2019 at 1:07 PM Alexander Lindsay via petsc-users wrote: > > >> I've been working on mechanical contact in MOOSE for a while, and it's led to me to think about general inequality constraint enforcement. I've been playing around with both `vinewtonssls` and `vinewtonrsls`. 
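For anyone reading along who has not used these two solvers, a minimal, self-contained sketch of selecting them is below; the tridiagonal operator and the bounds 0 <= u <= 10 are stand-ins for the actual MOOSE contact problem, which is not reproduced here.

/* Hypothetical sketch of selecting PETSc's VI solvers.  The residual
   F(u) = A u - b with a 1D Laplacian-like A is a stand-in problem. */
#include <petscsnes.h>

static PetscErrorCode FormFunction(SNES snes, Vec U, Vec F, void *ctx)
{
  Mat            A = (Mat)ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatMult(A, U, F);CHKERRQ(ierr);      /* F = A u              */
  ierr = VecShift(F, -1.0);CHKERRQ(ierr);     /* F = A u - b, b = 1   */
  PetscFunctionReturn(0);
}

static PetscErrorCode FormJacobian(SNES snes, Vec U, Mat J, Mat P, void *ctx)
{
  PetscFunctionBeginUser;                     /* Jacobian is the constant A */
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  SNES           snes;
  Mat            A;
  Vec            u, r, xl, xu;
  PetscInt       i, n = 100, Istart, Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, &A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)   { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    if (i < n-1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &u, &r);CHKERRQ(ierr);
  ierr = VecSet(u, 1.0);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &xl);CHKERRQ(ierr); ierr = VecSet(xl, 0.0);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &xu);CHKERRQ(ierr); ierr = VecSet(xu, 10.0);CHKERRQ(ierr);

  ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
  ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr);   /* or SNESVINEWTONSSLS      */
  ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr); /* the bounds 0 <= u <= 10  */
  ierr = SNESSetFunction(snes, r, FormFunction, A);CHKERRQ(ierr);
  ierr = SNESSetJacobian(snes, A, A, FormJacobian, NULL);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);  /* -snes_type vinewtonrsls|vinewtonssls */
  ierr = SNESSolve(snes, NULL, u);CHKERRQ(ierr);

  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&xl);CHKERRQ(ierr); ierr = VecDestroy(&xu);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Running with -snes_vi_monitor produces per-iteration active-set output of the kind shown earlier in this thread.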
In Benson's and Munson's Flexible Complementarity Solvers paper, they were able to solve 73.7% of their problems with SS and 65.5% with RS which led them to conclude that the SS method is generally more robust. We have had at least one instance where a MOOSE user reported an order of magnitude reduction in non-linear iterations when switching from SS to RS. Moreover, when running the problem described in this issue, I get these results: > > >> > > >> num_elements = 100 > > >> SS nl iterations = 53 > > >> RS nl iterations = 22 > > >> > > >> num_elements = 1000 > > >> SS nl iterations = 123 > > >> RS nl iterations = 140 > > >> > > >> num_elements = 10000 > > >> SS: fails to converge within 50 nl iterations during the second time step whether using a `basic` or `bt` line search > > >> RS: fails to converge within 50 nl iterations during the second time step whether using a `basic` or `bt` line search (although I believe `vinewtonrsls` performs a line-search that is guaranteed to keep the degrees of freedom within their bounds) > > >> > > >> So depending on the number of elements, it appears that either SS or RS may be more performant. I guess since I can get different relative performance with even the same PDE, it would be silly for me to ask for guidance on when to use which? In the conclusion of Benson's and Munson's paper, they mention using mesh sequencing for generating initial guesses on finer meshes. Does anyone know whether there have been any publications using PETSc/TAO and mesh sequencing for solving large VI problems? > > >> > > >> A related question: what needs to be done to allow SS to run with `-snes_mf_operator`? RS already appears to support the option. > > > > > > This may not make sense. Is the operator used in the SS solution process derivable from the function that is being optimized with the constraints or some strange scaled beast? > > >> > > > > > > From bsmith at mcs.anl.gov Fri Nov 1 10:24:22 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 1 Nov 2019 15:24:22 +0000 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: References: Message-ID: <51DE3BAE-2B7F-4273-95AB-9B0B32D4A483@anl.gov> Certain OpenMPI versions have bugs where even when you properly duplicate and then free communicators it eventually "runs out of communicators". This is a definitely a bug and was fixed in later OpenMPI versions. We wasted a lot of time tracking down this bug in the past. By now it is an old version of OpenMPI; the OpenMPI site https://www.open-mpi.org/software/ompi/v4.0/ lists the buggy versions as retired. So the question is should PETSc attempt to change its behavior or add functionality or hacks to work around this bug? My answer is NO. This is a "NEW" cluster! A "NEW" cluster is not running OpenMPI 2.1 by definition of new. The cluster manager needs to remove the buggy version of OpenMPI from their system. If the cluster manager is incapable of doing the most elementary part of the their job (removing buggy code) then the application person is stuck having to put hacks into their code to work around the bugs on their cluster; it cannot be PETSc's responsibility to distorted itself due to ancient bugs in other software. Barry Note that this OpenMPI bug does not affect very many MPI or PETSc codes. It only affects those codes that completely correctly call duplicate and free many times. This is why PETSc configure doesn't blacklist the OpenMPI version (though perhaps it should). 
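To make the failure mode concrete, a bare-MPI sketch of the pattern being described, that is, a code that duplicates and frees communicators correctly but many times, might look like the following; the loop count is arbitrary, chosen only to exceed the roughly 65534 context ids visible in the error message quoted in this thread.

/* Hypothetical reproducer sketch: each cycle duplicates and then frees a
   communicator, so nothing is leaked.  A correct MPI handles this fine; the
   buggy Open MPI releases eventually fail inside MPI_Comm_dup anyway. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Comm dup;
  int      i, rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  for (i = 0; i < 100000; i++) {
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);
    /* ... in the application this span is the lifetime of a Mat/Vec/KSP ... */
    MPI_Comm_free(&dup);              /* freed correctly: no communicator is leaked */
  }
  if (!rank) printf("completed %d dup/free cycles\n", i);
  MPI_Finalize();
  return 0;
}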
> On Nov 1, 2019, at 5:41 AM, Patrick Sanan via petsc-users wrote: > > Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): > > [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup > [lo-a2-058:21425] *** reported by process [4222287873,2] > [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 > [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error > [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, > [lo-a2-058:21425] *** and potentially your MPI job) > > Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. > > If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using 
internal PETSc communicator 1140850688 -2080374784 > > [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > > > > From patrick.sanan at gmail.com Fri Nov 1 10:54:04 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 1 Nov 2019 16:54:04 +0100 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: <51DE3BAE-2B7F-4273-95AB-9B0B32D4A483@anl.gov> References: <51DE3BAE-2B7F-4273-95AB-9B0B32D4A483@anl.gov> Message-ID: Thanks, Barry. I should have realized that was an ancient version. The cluster does have Open MPI 4.0.1 so I'll see if we can't use that instead. (I'm sure that the old version is there just to provide continuity - the weird thing is that the previous, quite similar, cluster used Open MPI 1.6.5 and that seemed to work fine with this application :D ) > Am 01.11.2019 um 16:24 schrieb Smith, Barry F. : > > > Certain OpenMPI versions have bugs where even when you properly duplicate and then free communicators it eventually "runs out of communicators". This is a definitely a bug and was fixed in later OpenMPI versions. We wasted a lot of time tracking down this bug in the past. By now it is an old version of OpenMPI; the OpenMPI site https://www.open-mpi.org/software/ompi/v4.0/ lists the buggy versions as retired. 
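When several Open MPI modules coexist on a cluster, a small stand-alone check like the sketch below (plain MPI-3, no PETSc needed) can confirm which library version an executable is actually linked against before digging further.

/* Print the MPI library version string reported by the linked MPI. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  char version[MPI_MAX_LIBRARY_VERSION_STRING];
  int  len, rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Get_library_version(version, &len);
  if (!rank) printf("%s\n", version);   /* e.g. the Open MPI release string */
  MPI_Finalize();
  return 0;
}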
> > So the question is should PETSc attempt to change its behavior or add functionality or hacks to work around this bug? > > My answer is NO. This is a "NEW" cluster! A "NEW" cluster is not running OpenMPI 2.1 by definition of new. The cluster manager needs to remove the buggy version of OpenMPI from their system. If the cluster manager is incapable of doing the most elementary part of the their job (removing buggy code) then the application person is stuck having to put hacks into their code to work around the bugs on their cluster; it cannot be PETSc's responsibility to distorted itself due to ancient bugs in other software. > > Barry > > Note that this OpenMPI bug does not affect very many MPI or PETSc codes. It only affects those codes that completely correctly call duplicate and free many times. This is why PETSc configure doesn't blacklist the OpenMPI version (though perhaps it should). > > > >> On Nov 1, 2019, at 5:41 AM, Patrick Sanan via petsc-users wrote: >> >> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): >> >> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup >> [lo-a2-058:21425] *** reported by process [4222287873,2] >> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 >> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error >> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, >> [lo-a2-058:21425] *** and potentially your MPI job) >> >> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. >> >> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. 
Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: >> >> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> >> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using 
internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >> >> >> >> > From bsmith at mcs.anl.gov Fri Nov 1 13:39:17 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 1 Nov 2019 18:39:17 +0000 Subject: [petsc-users] Do the guards against calling MPI_Comm_dup() in PetscCommDuplicate() apply with Fortran? In-Reply-To: References: <51DE3BAE-2B7F-4273-95AB-9B0B32D4A483@anl.gov> Message-ID: <1BC6A575-727A-4108-BDFC-86EAA82BF680@mcs.anl.gov> > On Nov 1, 2019, at 10:54 AM, Patrick Sanan wrote: > > Thanks, Barry. I should have realized that was an ancient version. The cluster does have Open MPI 4.0.1 so I'll see if we can't use that instead. (I'm sure that the old version is there just to provide continuity - the weird thing is that the previous, quite similar, cluster used Open MPI 1.6.5 and that seemed to work fine with this application :D ) Yes, the bug was introduced into OpenMPI at some point and then removed at a later point, so it is actually completely reasonable that the older OpenMPI worked fine. Barry > >> Am 01.11.2019 um 16:24 schrieb Smith, Barry F. : >> >> >> Certain OpenMPI versions have bugs where even when you properly duplicate and then free communicators it eventually "runs out of communicators". This is a definitely a bug and was fixed in later OpenMPI versions. We wasted a lot of time tracking down this bug in the past. By now it is an old version of OpenMPI; the OpenMPI site https://www.open-mpi.org/software/ompi/v4.0/ lists the buggy versions as retired. >> >> So the question is should PETSc attempt to change its behavior or add functionality or hacks to work around this bug? >> >> My answer is NO. This is a "NEW" cluster! A "NEW" cluster is not running OpenMPI 2.1 by definition of new. The cluster manager needs to remove the buggy version of OpenMPI from their system. If the cluster manager is incapable of doing the most elementary part of the their job (removing buggy code) then the application person is stuck having to put hacks into their code to work around the bugs on their cluster; it cannot be PETSc's responsibility to distorted itself due to ancient bugs in other software. >> >> Barry >> >> Note that this OpenMPI bug does not affect very many MPI or PETSc codes. It only affects those codes that completely correctly call duplicate and free many times. This is why PETSc configure doesn't blacklist the OpenMPI version (though perhaps it should). >> >> >> >>> On Nov 1, 2019, at 5:41 AM, Patrick Sanan via petsc-users wrote: >>> >>> Context: I'm trying to track down an error that (only) arises when running a Fortran 90 code, using PETSc, on a new cluster. The code creates and destroys a linear system (Mat,Vec, and KSP) at each of (many) timesteps. 
The error message from a user looks like this, which leads me to suspect that MPI_Comm_dup() is being called many times and this is eventually a problem for this particular MPI implementation (Open MPI 2.1.0): >>> >>> [lo-a2-058:21425] *** An error occurred in MPI_Comm_dup >>> [lo-a2-058:21425] *** reported by process [4222287873,2] >>> [lo-a2-058:21425] *** on communicator MPI COMMUNICATOR 65534 DUP FROM 65533 >>> [lo-a2-058:21425] *** MPI_ERR_INTERN: internal error >>> [lo-a2-058:21425] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, >>> [lo-a2-058:21425] *** and potentially your MPI job) >>> >>> Question: I remember some discussion recently (but can't find the thread) about not calling MPI_Comm_dup() too many times from PetscCommDuplicate(), which would allow one to safely use the (admittedly not optimal) approach used in this application code. Is that a correct understanding and would the fixes made in that context also apply to Fortran? I don't fully understand the details of the MPI techniques used, so thought I'd ask here. >>> >>> If I hack a simple build-solve-destroy example to run several loops, I see a notable difference between C and Fortran examples. With the attached ex223.c and ex221f.F90, which just add outer loops (5 iterations) to KSP tutorials examples ex23.c and ex21f.F90, respectively, I see the following. Note that in the Fortran case, it appears that communicators are actually duplicated in each loop, but in the C case, this only happens in the first loop: >>> >>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex223 -info | grep PetscCommDuplicate >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> >>> [(arch-maint-extra-opt) tutorials (maint *$%=)]$ ./ex221f -info | grep PetscCommDuplicate >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 
-2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 268435455 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 >>> >>> >>> >>> >> > From sajidsyed2021 at u.northwestern.edu Fri Nov 1 14:06:33 2019 From: sajidsyed2021 at u.northwestern.edu (Sajid Ali) Date: Fri, 1 Nov 2019 14:06:33 -0500 Subject: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad Message-ID: Hi PETSc-developers, I'm unable to debug a crash with VecDestroy that seems to depend only on whether or not a VecLoad was performed on a vector that was generated by duplicating one generated by MatCreateVecsFFTW. I'm attaching two examples ex1.c and ex2.c. The first one just creates vectors aligned as per FFTW layout, duplicates one of them and destroys all at the end. A bug related to this was fixed sometime between the 3.11 release and 3.12 release. I've tested this code with the versions 3.11.1 and 3.12.1 and as expected it runs with no issues for 3.12.1 and fails with 3.11.1. Now, the second one just adds a few lines which load a vector from memory to the duplicated vector before destroying all. For some reason, this code fails for both 3.11.1 and 3.12.1 versions. I'm lost as to what may cause this error and would appreciate any help in how to debug this. Thanks in advance for the help! PS: I've attached the two codes, ex1.c/ex2.c, the log files for both make and run and finally a bash script that was run to compile/log and control the version of petsc used. 
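Since the attachments are not reproduced in the archive, the sketch below is only a reconstruction of the sequence being described, not the attached ex2.c itself. It assumes a complex-scalar PETSc build configured with FFTW and HDF5, and the file name "vec.h5" and dataset name "w" are placeholders.

/* Reconstruction sketch: duplicate a vector laid out by MatCreateVecsFFTW,
   VecLoad into the duplicate from an HDF5 file, then destroy everything. */
#include <petscmat.h>
#include <petscviewerhdf5.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, y, z, w;
  PetscViewer    viewer;
  PetscInt       dim[2] = {8, 8};
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreateFFT(PETSC_COMM_WORLD, 2, dim, MATFFTW, &A);CHKERRQ(ierr);
  ierr = MatCreateVecsFFTW(A, &x, &y, &z);CHKERRQ(ierr);  /* FFTW-aligned layouts      */
  ierr = VecDuplicate(x, &w);CHKERRQ(ierr);               /* duplicate of an FFTW vec  */

  ierr = PetscObjectSetName((PetscObject)w, "w");CHKERRQ(ierr);
  ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "vec.h5", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = VecLoad(w, viewer);CHKERRQ(ierr);                /* the step said to make the difference */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = VecDestroy(&w);CHKERRQ(ierr);                    /* reported to fail after the VecLoad   */
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = VecDestroy(&z);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}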
-- Sajid Ali Applied Physics Northwestern University s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_make_3_12_1 Type: application/octet-stream Size: 3604 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_make_3_11_1 Type: application/octet-stream Size: 3768 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1.c Type: application/octet-stream Size: 1809 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_log_3_12_1 Type: application/octet-stream Size: 351 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_log_3_11_1 Type: application/octet-stream Size: 433628 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2.c Type: application/octet-stream Size: 2186 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2_make_3_11_1 Type: application/octet-stream Size: 3768 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2_make_3_12_1 Type: application/octet-stream Size: 3604 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2_log_3_12_1 Type: application/octet-stream Size: 433628 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.sh Type: application/octet-stream Size: 514 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2_log_3_11_1 Type: application/octet-stream Size: 434412 bytes Desc: not available URL: From jczhang at mcs.anl.gov Fri Nov 1 16:50:10 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Fri, 1 Nov 2019 21:50:10 +0000 Subject: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad In-Reply-To: References: Message-ID: I know nothing about Vec FFTW, but if you can provide hdf5 files in your test, I will see if I can reproduce it. --Junchao Zhang On Fri, Nov 1, 2019 at 2:08 PM Sajid Ali via petsc-users > wrote: Hi PETSc-developers, I'm unable to debug a crash with VecDestroy that seems to depend only on whether or not a VecLoad was performed on a vector that was generated by duplicating one generated by MatCreateVecsFFTW. I'm attaching two examples ex1.c and ex2.c. The first one just creates vectors aligned as per FFTW layout, duplicates one of them and destroys all at the end. A bug related to this was fixed sometime between the 3.11 release and 3.12 release. I've tested this code with the versions 3.11.1 and 3.12.1 and as expected it runs with no issues for 3.12.1 and fails with 3.11.1. Now, the second one just adds a few lines which load a vector from memory to the duplicated vector before destroying all. For some reason, this code fails for both 3.11.1 and 3.12.1 versions. I'm lost as to what may cause this error and would appreciate any help in how to debug this. Thanks in advance for the help! 
PS: I've attached the two codes, ex1.c/ex2.c, the log files for both make and run and finally a bash script that was run to compile/log and control the version of petsc used. -- Sajid Ali Applied Physics Northwestern University s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Nov 1 18:10:38 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 1 Nov 2019 23:10:38 +0000 Subject: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad In-Reply-To: References: Message-ID: <9495B365-A1AE-4840-8E2D-45CD01DE3D41@anl.gov> > On Nov 1, 2019, at 4:50 PM, Zhang, Junchao via petsc-users wrote: > > I know nothing about Vec FFTW, You are lucky :-) > but if you can provide hdf5 files in your test, I will see if I can reproduce it. > --Junchao Zhang > > > On Fri, Nov 1, 2019 at 2:08 PM Sajid Ali via petsc-users wrote: > Hi PETSc-developers, > > I'm unable to debug a crash with VecDestroy that seems to depend only on whether or not a VecLoad was performed on a vector that was generated by duplicating one generated by MatCreateVecsFFTW. > > I'm attaching two examples ex1.c and ex2.c. The first one just creates vectors aligned as per FFTW layout, duplicates one of them and destroys all at the end. A bug related to this was fixed sometime between the 3.11 release and 3.12 release. I've tested this code with the versions 3.11.1 and 3.12.1 and as expected it runs with no issues for 3.12.1 and fails with 3.11.1. > > Now, the second one just adds a few lines which load a vector from memory to the duplicated vector before destroying all. For some reason, this code fails for both 3.11.1 and 3.12.1 versions. I'm lost as to what may cause this error and would appreciate any help in how to debug this. Thanks in advance for the help! > > PS: I've attached the two codes, ex1.c/ex2.c, the log files for both make and run and finally a bash script that was run to compile/log and control the version of petsc used. > > > -- > Sajid Ali > Applied Physics > Northwestern University > s-sajid-ali.github.io From sajidsyed2021 at u.northwestern.edu Fri Nov 1 18:49:56 2019 From: sajidsyed2021 at u.northwestern.edu (Sajid Ali) Date: Fri, 1 Nov 2019 18:49:56 -0500 Subject: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad In-Reply-To: <9495B365-A1AE-4840-8E2D-45CD01DE3D41@anl.gov> References: <9495B365-A1AE-4840-8E2D-45CD01DE3D41@anl.gov> Message-ID: Hi Junchao/Barry, It doesn't really matter what the h5 file contains, so I'm attaching a lightly edited script of src/vec/vec/examples/tutorials/ex10.c which should produce a vector to be used as input for the above test case. (I'm working with ` --with-scalar-type=complex`). Now that I think of it, fixing this bug is not important, I can workaround the issue by creating a new vector with VecCreateMPI and accept the small loss in performance of VecPointwiseMult due to misaligned layouts. If it's a small fix it may be worth the time, but fixing this is not a big priority right now. If it's a complicated fix, this issue can serve as a note to future users. Thank You, Sajid Ali Applied Physics Northwestern University s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ex10.c Type: application/octet-stream Size: 1568 bytes Desc: not available URL: From juaneah at gmail.com Sun Nov 3 23:41:50 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Sun, 3 Nov 2019 23:41:50 -0600 Subject: [petsc-users] doubts on VecScatterCreate Message-ID: Hi everyone, thanks in advance. I have three parallel vectors: A, B and C. A and B have different sizes, and C must be contain these two vectors (MatLab notation C=[A;B]). I need to do some operations on C then put back the proper portion of C on A and B, then I do some computations on A and B y put again on C, and the loop repeats. For these propose I use Scatters: C is created as a parallel vector with size of (sizeA + sizeB) with petsc_decide for parallel layout. The vectors have been distributed on the same amount of processes. For the specific case with order [A;B] VecGetOwnershipRange(A,&start,&end); ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is redundant VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); VecGetSize(A,&sizeA) VecGetOwnershipRange(B,&start,&end); ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); //shifts the index location VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); Then I can use VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); and VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); and the same with B. I used MPI_COMM SELF and I got the same results. *The situation is: My results look good for the portion of B, but no for the portion of A, there is something that I'm doing wrong with the scattering?* Best regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From juaneah at gmail.com Sun Nov 3 23:49:38 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Sun, 3 Nov 2019 23:49:38 -0600 Subject: [petsc-users] doubts on VecScatterCreate In-Reply-To: References: Message-ID: *I mean, the portion for A look like messy distribution, as if scatter was done wrong* El dom., 3 de nov. de 2019 a la(s) 23:41, Emmanuel Ayala (juaneah at gmail.com) escribi?: > Hi everyone, thanks in advance. > > I have three parallel vectors: A, B and C. A and B have different sizes, > and C must be contain these two vectors (MatLab notation C=[A;B]). I need > to do some operations on C then put back the proper portion of C on A and > B, then I do some computations on A and B y put again on C, and the loop > repeats. > > For these propose I use Scatters: > > C is created as a parallel vector with size of (sizeA + sizeB) with > petsc_decide for parallel layout. The vectors have been distributed on the > same amount of processes. 
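For reference, a self-contained version of this C = [A;B] pattern, with arbitrary sizes and values, is sketched below; it follows the same ISCreateStride/VecScatterCreate construction quoted here, and, as Barry's reply further down indicates, the pattern itself works, so a complete failing code is what is needed to go further.

/* Sketch: scatter A and B forward into C = [A;B], then back again.  The
   reverse scatters should return exactly the original A and B. */
#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            A, B, C;
  VecScatter     scatter1, scatter2;
  IS             isA, isB, isC1, isC2;
  PetscInt       nA = 7, nB = 5, sizeA, start, end;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, nA, &A);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, nB, &B);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, nA + nB, &C);CHKERRQ(ierr);
  ierr = VecSet(A, 1.0);CHKERRQ(ierr);
  ierr = VecSet(B, 2.0);CHKERRQ(ierr);
  ierr = VecGetSize(A, &sizeA);CHKERRQ(ierr);

  /* A -> first block of C: target indices equal A's global indices */
  ierr = VecGetOwnershipRange(A, &start, &end);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD, end - start, start, 1, &isA);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD, end - start, start, 1, &isC1);CHKERRQ(ierr);
  ierr = VecScatterCreate(A, isA, C, isC1, &scatter1);CHKERRQ(ierr);

  /* B -> second block of C: target indices shifted by sizeA */
  ierr = VecGetOwnershipRange(B, &start, &end);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD, end - start, start, 1, &isB);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD, end - start, start + sizeA, 1, &isC2);CHKERRQ(ierr);
  ierr = VecScatterCreate(B, isB, C, isC2, &scatter2);CHKERRQ(ierr);

  /* forward: C = [A;B] */
  ierr = VecScatterBegin(scatter1, A, C, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter1, A, C, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterBegin(scatter2, B, C, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter2, B, C, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

  /* reverse: pull the two blocks of C back into A and B */
  ierr = VecScatterBegin(scatter1, C, A, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter1, C, A, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecScatterBegin(scatter2, C, B, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecScatterEnd(scatter2, C, B, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
  ierr = VecView(C, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = ISDestroy(&isA);CHKERRQ(ierr);  ierr = ISDestroy(&isB);CHKERRQ(ierr);
  ierr = ISDestroy(&isC1);CHKERRQ(ierr); ierr = ISDestroy(&isC2);CHKERRQ(ierr);
  ierr = VecScatterDestroy(&scatter1);CHKERRQ(ierr);
  ierr = VecScatterDestroy(&scatter2);CHKERRQ(ierr);
  ierr = VecDestroy(&A);CHKERRQ(ierr); ierr = VecDestroy(&B);CHKERRQ(ierr); ierr = VecDestroy(&C);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}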
> > For the specific case with order [A;B] > > VecGetOwnershipRange(A,&start,&end); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is > redundant > VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); > > VecGetSize(A,&sizeA) > VecGetOwnershipRange(B,&start,&end); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); > ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); > //shifts the index location > VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); > > Then I can use > VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > > and > VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > > and the same with B. > I used MPI_COMM SELF and I got the same results. > > *The situation is: My results look good for the portion of B, but no for > the portion of A, there is something that I'm doing wrong with the > scattering?* > > Best regards. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Nov 4 08:47:47 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 4 Nov 2019 14:47:47 +0000 Subject: [petsc-users] doubts on VecScatterCreate In-Reply-To: References: Message-ID: <35D3FEA6-6B71-44F4-B746-23461B581E9C@anl.gov> It works for me. Please send a complete code that fails. > On Nov 3, 2019, at 11:41 PM, Emmanuel Ayala via petsc-users wrote: > > Hi everyone, thanks in advance. > > I have three parallel vectors: A, B and C. A and B have different sizes, and C must be contain these two vectors (MatLab notation C=[A;B]). I need to do some operations on C then put back the proper portion of C on A and B, then I do some computations on A and B y put again on C, and the loop repeats. > > For these propose I use Scatters: > > C is created as a parallel vector with size of (sizeA + sizeB) with petsc_decide for parallel layout. The vectors have been distributed on the same amount of processes. > > For the specific case with order [A;B] > > VecGetOwnershipRange(A,&start,&end); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is redundant > VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); > > VecGetSize(A,&sizeA) > VecGetOwnershipRange(B,&start,&end); > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); > ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); //shifts the index location > VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); > > Then I can use > VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > > and > VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > > and the same with B. > I used MPI_COMM SELF and I got the same results. > > The situation is: My results look good for the portion of B, but no for the portion of A, there is something that I'm doing wrong with the scattering? > > Best regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: ex1.c Type: application/octet-stream Size: 1481 bytes Desc: ex1.c URL: From perceval.desforges at polytechnique.edu Mon Nov 4 11:33:54 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Mon, 04 Nov 2019 18:33:54 +0100 Subject: [petsc-users] SLEPC no speedup in parallel Message-ID: <82480440c0cf3973ca6e935413279be3@polytechnique.edu> Dear petsc and slepc developpers, I am using slepc to solve an eigenvalue problem. Since I need all the eigenvalues in a certain interval, I use the spectrum slicing technique with mumps. However I do not understand: when I run my code with more than one processor, there is no speedup at all, and it even slows down, and I don't understand why. I wanted to test further and I ran the same code without spectrum slicing, and asking for about the same amount of eigenvalues. The calculation was much slower (about 10 times slower), but using more than one processor sped it up. Is this normal behavior or am I doing something wrong? Thanks, Best regards, Perceval, -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Nov 4 11:45:36 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 4 Nov 2019 18:45:36 +0100 Subject: [petsc-users] SLEPC no speedup in parallel In-Reply-To: <82480440c0cf3973ca6e935413279be3@polytechnique.edu> References: <82480440c0cf3973ca6e935413279be3@polytechnique.edu> Message-ID: <710EDF59-3233-470E-8E1F-22E3B504556B@dsic.upv.es> Did you follow the instructions in section 3.4.5 of the SLEPc users manual? Send the output of -eps_view Jose > El 4 nov 2019, a las 18:33, Perceval Desforges via petsc-users escribi?: > > Dear petsc and slepc developpers, > > I am using slepc to solve an eigenvalue problem. Since I need all the eigenvalues in a certain interval, I use the spectrum slicing technique with mumps. However I do not understand: when I run my code with more than one processor, there is no speedup at all, and it even slows down, and I don't understand why. > > I wanted to test further and I ran the same code without spectrum slicing, and asking for about the same amount of eigenvalues. The calculation was much slower (about 10 times slower), but using more than one processor sped it up. > > Is this normal behavior or am I doing something wrong? > > Thanks, > > Best regards, > > Perceval, > > > From alexlindsay239 at gmail.com Mon Nov 4 12:44:14 2019 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Mon, 4 Nov 2019 12:44:14 -0600 Subject: [petsc-users] VI: RS vs SS In-Reply-To: References: <6DA5E815-6FB8-465F-98A5-3BA67F668AFB@mcs.anl.gov> <7D085275-4E5C-4152-9784-51A703688A0A@mcs.anl.gov> Message-ID: I'm not too familiar with the M and q notation. However, I've attached A and b for the unconstrained linear problem in PETSc binary format (don't know if they'll go through on this list...). l and u are 0 and 10 respectively. On Fri, Nov 1, 2019 at 10:19 AM Munson, Todd wrote: > > Yes, that looks weird. Can you send me directly the linear problem (M, q, > l, and u)? I > will take a look and run some other diagnostics with some of my other > tools. > > Thanks, Todd. > > > On Nov 1, 2019, at 10:14 AM, Alexander Lindsay > wrote: > > > > No, the matrix is not symmetric because of how we impose some Dirichlet > conditions on the boundary. I could easily give you the Jacobian, for one > of the "bad" problems. 
But at least in the case of RSLS, I don't know > whether the algorithm is performing badly, or whether the slow convergence > is simply a property of the algorithm. Here's a VI monitor history for a > representative "bad" solve. > > > > 0 SNES VI Function norm 0.229489 Active lower constraints 0/1 upper > constraints 0/1 Percent of total 0. Percent of bounded 0. > > 1 SNES VI Function norm 0.365268 Active lower constraints 83/85 upper > constraints 83/85 Percent of total 0.207241 Percent of bounded 0. > > 2 SNES VI Function norm 0.495088 Active lower constraints 82/84 upper > constraints 82/84 Percent of total 0.204744 Percent of bounded 0. > > 3 SNES VI Function norm 0.478328 Active lower constraints 81/83 upper > constraints 81/83 Percent of total 0.202247 Percent of bounded 0. > > 4 SNES VI Function norm 0.46163 Active lower constraints 80/82 upper > constraints 80/82 Percent of total 0.19975 Percent of bounded 0. > > 5 SNES VI Function norm 0.444996 Active lower constraints 79/81 upper > constraints 79/81 Percent of total 0.197253 Percent of bounded 0. > > 6 SNES VI Function norm 0.428424 Active lower constraints 78/80 upper > constraints 78/80 Percent of total 0.194757 Percent of bounded 0. > > 7 SNES VI Function norm 0.411916 Active lower constraints 77/79 upper > constraints 77/79 Percent of total 0.19226 Percent of bounded 0. > > 8 SNES VI Function norm 0.395472 Active lower constraints 76/78 upper > constraints 76/78 Percent of total 0.189763 Percent of bounded 0. > > 9 SNES VI Function norm 0.379092 Active lower constraints 75/77 upper > constraints 75/77 Percent of total 0.187266 Percent of bounded 0. > > 10 SNES VI Function norm 0.362776 Active lower constraints 74/76 upper > constraints 74/76 Percent of total 0.184769 Percent of bounded 0. > > 11 SNES VI Function norm 0.346525 Active lower constraints 73/75 upper > constraints 73/75 Percent of total 0.182272 Percent of bounded 0. > > 12 SNES VI Function norm 0.330338 Active lower constraints 72/74 upper > constraints 72/74 Percent of total 0.179775 Percent of bounded 0. > > 13 SNES VI Function norm 0.314217 Active lower constraints 71/73 upper > constraints 71/73 Percent of total 0.177278 Percent of bounded 0. > > 14 SNES VI Function norm 0.298162 Active lower constraints 70/72 upper > constraints 70/72 Percent of total 0.174782 Percent of bounded 0. > > 15 SNES VI Function norm 0.282173 Active lower constraints 69/71 upper > constraints 69/71 Percent of total 0.172285 Percent of bounded 0. > > 16 SNES VI Function norm 0.26625 Active lower constraints 68/70 upper > constraints 68/70 Percent of total 0.169788 Percent of bounded 0. > > 17 SNES VI Function norm 0.250393 Active lower constraints 67/69 upper > constraints 67/69 Percent of total 0.167291 Percent of bounded 0. > > 18 SNES VI Function norm 0.234604 Active lower constraints 66/68 upper > constraints 66/68 Percent of total 0.164794 Percent of bounded 0. > > 19 SNES VI Function norm 0.218882 Active lower constraints 65/67 upper > constraints 65/67 Percent of total 0.162297 Percent of bounded 0. > > 20 SNES VI Function norm 0.203229 Active lower constraints 64/66 upper > constraints 64/66 Percent of total 0.1598 Percent of bounded 0. > > 21 SNES VI Function norm 0.187643 Active lower constraints 63/65 upper > constraints 63/65 Percent of total 0.157303 Percent of bounded 0. > > 22 SNES VI Function norm 0.172126 Active lower constraints 62/64 upper > constraints 62/64 Percent of total 0.154806 Percent of bounded 0. 
> > 23 SNES VI Function norm 0.156679 Active lower constraints 61/63 upper > constraints 61/63 Percent of total 0.15231 Percent of bounded 0. > > 24 SNES VI Function norm 0.141301 Active lower constraints 60/62 upper > constraints 60/62 Percent of total 0.149813 Percent of bounded 0. > > 25 SNES VI Function norm 0.125993 Active lower constraints 59/61 upper > constraints 59/61 Percent of total 0.147316 Percent of bounded 0. > > 26 SNES VI Function norm 0.110755 Active lower constraints 58/60 upper > constraints 58/60 Percent of total 0.144819 Percent of bounded 0. > > 27 SNES VI Function norm 0.0955886 Active lower constraints 57/59 upper > constraints 57/59 Percent of total 0.142322 Percent of bounded 0. > > 28 SNES VI Function norm 0.0804936 Active lower constraints 56/58 upper > constraints 56/58 Percent of total 0.139825 Percent of bounded 0. > > 29 SNES VI Function norm 0.0654705 Active lower constraints 55/57 upper > constraints 55/57 Percent of total 0.137328 Percent of bounded 0. > > 30 SNES VI Function norm 0.0505198 Active lower constraints 54/56 upper > constraints 54/56 Percent of total 0.134831 Percent of bounded 0. > > 31 SNES VI Function norm 0.0356422 Active lower constraints 53/55 upper > constraints 53/55 Percent of total 0.132335 Percent of bounded 0. > > 32 SNES VI Function norm 0.020838 Active lower constraints 52/54 upper > constraints 52/54 Percent of total 0.129838 Percent of bounded 0. > > 33 SNES VI Function norm 0.0061078 Active lower constraints 51/53 upper > constraints 51/53 Percent of total 0.127341 Percent of bounded 0. > > 34 SNES VI Function norm 2.2664e-12 Active lower constraints 51/52 > upper constraints 51/52 Percent of total 0.127341 Percent of bounded 0. > > > > I've read that in some cases the VI solver is simply unable to move the > constraint set more than one grid cell per non-linear iteration. That looks > like what I'm seeing here... > > > > On Tue, Oct 29, 2019 at 7:15 AM Munson, Todd > wrote: > > > > Hi, > > > > Is the matrix for the linear PDE symmetric? If so, then the VI is > equivalent to > > finding the stationary points of a bound-constrained quadratic program > and you > > may want to use the TAO Newton Trust-Region or Line-Search methods for > > bound-constrained optimization problems. > > > > Alp: are there flags set when a problem is linear with a symmetric > matrix? Maybe > > we can do an internal reformulation in those cases to use the > optimization tools. > > > > Is there an easy way to get the matrix and the constant vector for one > of the > > problems that fails or does not perform well? Typically, the TAO RSLS > > methods will work well for the types of problems that you have and if > > they are not, then I can go about finding out why and making some > > improvements. > > > > Monotone in this case is that your matrix is positive semidefinite; > x^TMx >= 0 for > > all x. For M symmetric, this is the same as M having all nonnegative > eigenvalues. > > > > Todd. > > > > > On Oct 28, 2019, at 11:14 PM, Alexander Lindsay < > alexlindsay239 at gmail.com> wrote: > > > > > > On Thu, Oct 24, 2019 at 4:52 AM Munson, Todd > wrote: > > > > > > Hi, > > > > > > For these problems, how large are they? And are they linear or > nonlinear? > > > What I can do is use some fancier tools to help with what is going on > with > > > the solvers in certain cases. 
> > > > > > For the results cited above: > > > > > > 100 elements -> 101 dofs > > > 1,000 elements -> 1,001 dofs > > > 10,000 elements -> 10,001 dofs > > > > > > The PDE is linear with simple bounds constraints on the variable: 0 <= > u <= 10 > > > > > > > > > For Barry's question, the matrix in the SS solver is a diagonal matrix > plus > > > a column scaling of the Jacobian. > > > > > > Note: semismooth, reduced space and interior point methods mainly work > for > > > problems that are strictly monotone. > > > > > > Dumb question, but monotone in what way? > > > > > > Thanks for the replies! > > > > > > Alex > > > > > > Finding out what is going on with > > > your problems with some additional diagnostics might yield some > > > insights. > > > > > > Todd. > > > > > > > On Oct 24, 2019, at 3:36 AM, Smith, Barry F. > wrote: > > > > > > > > > > > > See bottom > > > > > > > > > > > >> On Oct 14, 2019, at 1:12 PM, Justin Chang via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > >> > > > >> It might depend on your application, but for my stuff on maximum > principles for advection-diffusion, I found RS to be much better than SS. > Here?s the paper I wrote documenting the performance numbers I came across > > > >> > > > >> https://www.sciencedirect.com/science/article/pii/S0045782516316176 > > > >> > > > >> Or the arXiV version: > > > >> > > > >> https://arxiv.org/pdf/1611.08758.pdf > > > >> > > > >> > > > >> On Mon, Oct 14, 2019 at 1:07 PM Alexander Lindsay via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > >> I've been working on mechanical contact in MOOSE for a while, and > it's led to me to think about general inequality constraint enforcement. > I've been playing around with both `vinewtonssls` and `vinewtonrsls`. In > Benson's and Munson's Flexible Complementarity Solvers paper, they were > able to solve 73.7% of their problems with SS and 65.5% with RS which led > them to conclude that the SS method is generally more robust. We have had > at least one instance where a MOOSE user reported an order of magnitude > reduction in non-linear iterations when switching from SS to RS. Moreover, > when running the problem described in this issue, I get these results: > > > >> > > > >> num_elements = 100 > > > >> SS nl iterations = 53 > > > >> RS nl iterations = 22 > > > >> > > > >> num_elements = 1000 > > > >> SS nl iterations = 123 > > > >> RS nl iterations = 140 > > > >> > > > >> num_elements = 10000 > > > >> SS: fails to converge within 50 nl iterations during the second > time step whether using a `basic` or `bt` line search > > > >> RS: fails to converge within 50 nl iterations during the second > time step whether using a `basic` or `bt` line search (although I believe > `vinewtonrsls` performs a line-search that is guaranteed to keep the > degrees of freedom within their bounds) > > > >> > > > >> So depending on the number of elements, it appears that either SS > or RS may be more performant. I guess since I can get different relative > performance with even the same PDE, it would be silly for me to ask for > guidance on when to use which? In the conclusion of Benson's and Munson's > paper, they mention using mesh sequencing for generating initial guesses on > finer meshes. Does anyone know whether there have been any publications > using PETSc/TAO and mesh sequencing for solving large VI problems? > > > >> > > > >> A related question: what needs to be done to allow SS to run with > `-snes_mf_operator`? RS already appears to support the option. 
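In outline, the bound-constrained setup being compared in this sub-thread looks roughly like the fragment below. This is a hypothetical sketch, not the actual MOOSE code: snes is assumed to be a SNES whose residual and Jacobian callbacks are already set, x is its solution vector, and the bounds 0 and 10 are the ones quoted above. The two flavours are then chosen at run time with -snes_type vinewtonrsls or -snes_type vinewtonssls.

  Vec xl,xu;
  ierr = VecDuplicate(x,&xl);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&xu);CHKERRQ(ierr);
  ierr = VecSet(xl,0.0);CHKERRQ(ierr);                      /* lower bound l = 0  */
  ierr = VecSet(xu,10.0);CHKERRQ(ierr);                     /* upper bound u = 10 */
  ierr = SNESSetType(snes,SNESVINEWTONRSLS);CHKERRQ(ierr);  /* or SNESVINEWTONSSLS */
  ierr = SNESVISetVariableBounds(snes,xl,xu);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);            /* -snes_type can still switch RS <-> SS */
  ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);
  ierr = VecDestroy(&xl);CHKERRQ(ierr);
  ierr = VecDestroy(&xu);CHKERRQ(ierr);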
> > > > > > > > This may not make sense. Is the operator used in the SS solution > process derivable from the function that is being optimized with the > constraints or some strange scaled beast? > > > >> > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: A.dat Type: application/octet-stream Size: 32032 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: b.dat Type: application/octet-stream Size: 6416 bytes Desc: not available URL: From aph at email.arizona.edu Mon Nov 4 14:14:43 2019 From: aph at email.arizona.edu (Anthony Paul Haas) Date: Mon, 4 Nov 2019 13:14:43 -0700 Subject: [petsc-users] --with-64-bit-indices=1 Message-ID: Hello, I ran into an issue while using Mumps from Petsc. I got the following error (see below please). Somebody suggested that I compile Petsc with --with-64-bit-indices=1. Will that suffice? Also I compiled my own version of Petsc on Cray Onyx (HPCMP) but although I compiled --with-debugging=0, Petsc was very very slow (compared to the version of Petsc available from the Cray admins). Do you have a list of flags that I should compile Petsc with for Cray supercomputers? Thanks, Anthony INFOG(1)=-51. I saw in the mumps manual that: An external ordering (Metis/ParMetis, SCOTCH/PT-SCOTCH, PORD), with 32-bit default integers, is invoked to processing a graph of size larger than 2^31-1. INFO(2) holds the size required to store the graph as a number of integer values; -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Nov 4 14:46:10 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 4 Nov 2019 20:46:10 +0000 Subject: [petsc-users] --with-64-bit-indices=1 In-Reply-To: References: Message-ID: > On Nov 4, 2019, at 2:14 PM, Anthony Paul Haas via petsc-users wrote: > > Hello, > > I ran into an issue while using Mumps from Petsc. I got the following error (see below please). Somebody suggested that I compile Petsc with --with-64-bit-indices=1. Will that suffice? Currently PETSc and MUMPS do not work together with --with-64-bit-indices=1. > Also I compiled my own version of Petsc on Cray Onyx (HPCMP) but although I compiled --with-debugging=0, Petsc was very very slow (compared to the version of Petsc available from the Cray admins). Do you have a list of flags that I should compile Petsc with for Cray supercomputers? No idea why it would be particularly slower. No way to know what compiler options they used. You also have a choice of different compilers on Cray, perhaps that makes a difference. > > Thanks, > > Anthony > > INFOG(1)=-51. I saw in the mumps manual that: > > An external ordering (Metis/ParMetis, SCOTCH/PT-SCOTCH, PORD), with 32-bit default > integers, is invoked to processing a graph of size larger than 2^31-1. INFO(2) holds the size > required to store the graph as a number of integer values; This is strange. Since PETSc cannot when using 32 bit indices produce such a large graph I cannot explain how this message was generated. 
Perhaps there was an integer overflow > > From fdkong.jd at gmail.com Mon Nov 4 17:28:17 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 4 Nov 2019 16:28:17 -0700 Subject: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson In-Reply-To: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> References: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> Message-ID: Thanks Jose, I think I understand now. Another question: what is the right way to setup a linear preconditioning matrix for the inner linear solver of JD? I was trying to do something like this: /* Create eigensolver context */ ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); /* Set operators. In this case, it is a standard eigenvalue problem */ ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); ierr = EPSGetST(eps,&st);CHKERRQ(ierr); ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); /* Set solver parameters at runtime */ ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Solve the eigensystem - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ ierr = EPSSolve(eps);CHKERRQ(ierr); But did not work. A complete example is attached. I could try to dig into the code, but you may already know the answer. On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman wrote: > Yes, it is confusing. Here is the explanation: when you use a target, the > preconditioner is built from matrix A-sigma*B. By default, instead of > TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat > this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner > is built from matrix B. The thing is that in a standard eigenproblem we > have B=I, and hence there is no point in using a preconditioner, that is > why we set PCNONE. > > Jose > > > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > Hi All, > > > > It looks like the preconditioner is hard-coded in the Jacobi-Davidson > solver. I could not select a preconditioner rather than the default setting. > > > > For example, I was trying to select LU, but PC NONE was still used. I > ran standard example 2 in slepc/src/eps/examples/tutorials, and had the > following results. 
> > > > > > Thanks, > > > > Fande > > > > > > ./ex2 -eps_type jd -st_ksp_type gmres -st_pc_type lu -eps_view > > > > 2-D Laplacian Eigenproblem, N=100 (10x10 grid) > > > > EPS Object: 1 MPI processes > > type: jd > > search subspace is orthogonalized > > block size=1 > > type of the initial subspace: non-Krylov > > size of the subspace after restarting: 6 > > number of vectors after restarting from the previous iteration: 1 > > threshold for changing the target in the correction equation (fix): > 0.01 > > problem type: symmetric eigenvalue problem > > selected portion of the spectrum: largest eigenvalues in magnitude > > number of eigenvalues (nev): 1 > > number of column vectors (ncv): 17 > > maximum dimension of projected problem (mpd): 17 > > maximum number of iterations: 1700 > > tolerance: 1e-08 > > convergence test: relative to the eigenvalue > > BV Object: 1 MPI processes > > type: svec > > 17 columns of global length 100 > > vector orthogonalization method: classical Gram-Schmidt > > orthogonalization refinement: if needed (eta: 0.7071) > > block orthogonalization method: GS > > doing matmult as a single matrix-matrix product > > DS Object: 1 MPI processes > > type: hep > > solving the problem with: Implicit QR method (_steqr) > > ST Object: 1 MPI processes > > type: precond > > shift: 1.79769e+308 > > number of matrices: 1 > > KSP Object: (st_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=90, initial guess is zero > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (st_) 1 MPI processes > > type: none > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: shell > > rows=100, cols=100 > > Solution method: jd > > > > Number of requested eigenvalues: 1 > > Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; > iterations 20 > > ---------------------- -------------------- > > k ||Ax-kx||/||kx|| > > ---------------------- -------------------- > > 7.837972 7.71944e-10 > > ---------------------- -------------------- > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex3.c Type: application/octet-stream Size: 7372 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Nov 4 19:38:30 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 5 Nov 2019 01:38:30 +0000 Subject: [petsc-users] --with-64-bit-indices=1 In-Reply-To: References: Message-ID: <0249846E-62A3-459D-A9F2-2717F1A622F6@mcs.anl.gov> You should upgrade to the lastest PETSc and that will also upgrade to the latest MUMPS. There may be better error checking and bounds checking now. Fundamentally until there is solid support for MUMPS with 64 bit indices you are stuck. SuperLU_DIST does fully support 64 bits but I have heard it is slower. Barry > On Nov 4, 2019, at 4:19 PM, Anthony Paul Haas wrote: > > Hi Barry, > > Thanks for your answer. Integer overflow seems to make sense. I am trying to do a direct inversion with Mumps LU. The code works for smaller grids but this case is pretty large (mesh is 12,001x 301). I am also attaching the output of the code in case that could provide more info. Do you know how I should proceed? 
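For anyone else hitting the MUMPS INFOG(1)=-51 limit: the only 64-bit-clean alternative mentioned above is SuperLU_DIST. As a hedged sketch only (these are the standard PETSc 3.11-era option names, not a recipe tested on this particular problem or on Cray), the configure line and runtime options would look like:

  ./configure --with-debugging=0 COPTFLAGS='-O2' CXXOPTFLAGS='-O2' FOPTFLAGS='-O2' \
              --with-64-bit-indices=1 --download-superlu_dist

  -pc_type lu -pc_factor_mat_solver_type superlu_dist

(Older PETSc versions spell the last option -pc_factor_mat_solver_package.)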
> > Thanks, > > Anthony > > On Mon, Nov 4, 2019 at 1:46 PM Smith, Barry F. wrote: > > > > > > On Nov 4, 2019, at 2:14 PM, Anthony Paul Haas via petsc-users wrote: > > > > Hello, > > > > I ran into an issue while using Mumps from Petsc. I got the following error (see below please). Somebody suggested that I compile Petsc with --with-64-bit-indices=1. Will that suffice? > > Currently PETSc and MUMPS do not work together with --with-64-bit-indices=1. > > > Also I compiled my own version of Petsc on Cray Onyx (HPCMP) but although I compiled --with-debugging=0, Petsc was very very slow (compared to the version of Petsc available from the Cray admins). Do you have a list of flags that I should compile Petsc with for Cray supercomputers? > > No idea why it would be particularly slower. No way to know what compiler options they used. > > You also have a choice of different compilers on Cray, perhaps that makes a difference. > > > > > Thanks, > > > > Anthony > > > > INFOG(1)=-51. I saw in the mumps manual that: > > > > An external ordering (Metis/ParMetis, SCOTCH/PT-SCOTCH, PORD), with 32-bit default > > integers, is invoked to processing a graph of size larger than 2^31-1. INFO(2) holds the size > > required to store the graph as a number of integer values; > > This is strange. Since PETSc cannot when using 32 bit indices produce such a large graph I cannot explain how this message was generated. Perhaps there was an integer overflow > > > > > > > > From perceval.desforges at polytechnique.edu Tue Nov 5 07:55:58 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Tue, 05 Nov 2019 14:55:58 +0100 Subject: [petsc-users] SLEPC no speedup in parallel In-Reply-To: <710EDF59-3233-470E-8E1F-22E3B504556B@dsic.upv.es> References: <82480440c0cf3973ca6e935413279be3@polytechnique.edu> <710EDF59-3233-470E-8E1F-22E3B504556B@dsic.upv.es> Message-ID: Hello, After carefully looking at the example in the tutorial suggested in section 3.4.5 of the manual, I managed to determine that the problem was caused by me calling EPSSetFromOptions(eps) after EPSKrylovSchurSetPartitions(eps,size) and not before. It now works fine. Sorry! Best regards, Perceval, Le 2019-11-04 18:45, Jose E. Roman a ?crit : > Did you follow the instructions in section 3.4.5 of the SLEPc users manual? > Send the output of -eps_view > > Jose > >> El 4 nov 2019, a las 18:33, Perceval Desforges via petsc-users escribi?: >> >> Dear petsc and slepc developpers, >> >> I am using slepc to solve an eigenvalue problem. Since I need all the eigenvalues in a certain interval, I use the spectrum slicing technique with mumps. However I do not understand: when I run my code with more than one processor, there is no speedup at all, and it even slows down, and I don't understand why. >> >> I wanted to test further and I ran the same code without spectrum slicing, and asking for about the same amount of eigenvalues. The calculation was much slower (about 10 times slower), but using more than one processor sped it up. >> >> Is this normal behavior or am I doing something wrong? >> >> Thanks, >> >> Best regards, >> >> Perceval, -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Nov 5 10:07:18 2019 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Tue, 5 Nov 2019 17:07:18 +0100 Subject: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson In-Reply-To: References: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> Message-ID: Currently, the function that passes the preconditioner matrix is specific of STPRECOND, so you have to add ierr = STSetType(st,STPRECOND);CHKERRQ(ierr); before ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); otherwise this latter call is ignored. We may be changing a little bit the way in which ST is initialized, and maybe we modify this as well. It is not decided yet. Jose > El 5 nov 2019, a las 0:28, Fande Kong escribi?: > > Thanks Jose, > > I think I understand now. Another question: what is the right way to setup a linear preconditioning matrix for the inner linear solver of JD? > > I was trying to do something like this: > > /* > Create eigensolver context > */ > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); > > /* > Set operators. In this case, it is a standard eigenvalue problem > */ > ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > ierr = EPSGetST(eps,&st);CHKERRQ(ierr); > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > > /* > Set solver parameters at runtime > */ > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > Solve the eigensystem > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > But did not work. A complete example is attached. I could try to dig into the code, but you may already know the answer. > > > On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman wrote: > Yes, it is confusing. Here is the explanation: when you use a target, the preconditioner is built from matrix A-sigma*B. By default, instead of TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner is built from matrix B. The thing is that in a standard eigenproblem we have B=I, and hence there is no point in using a preconditioner, that is why we set PCNONE. > > Jose > > > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users escribi?: > > > > Hi All, > > > > It looks like the preconditioner is hard-coded in the Jacobi-Davidson solver. I could not select a preconditioner rather than the default setting. > > > > For example, I was trying to select LU, but PC NONE was still used. I ran standard example 2 in slepc/src/eps/examples/tutorials, and had the following results. 
> > > > > > Thanks, > > > > Fande > > > > > > ./ex2 -eps_type jd -st_ksp_type gmres -st_pc_type lu -eps_view > > > > 2-D Laplacian Eigenproblem, N=100 (10x10 grid) > > > > EPS Object: 1 MPI processes > > type: jd > > search subspace is orthogonalized > > block size=1 > > type of the initial subspace: non-Krylov > > size of the subspace after restarting: 6 > > number of vectors after restarting from the previous iteration: 1 > > threshold for changing the target in the correction equation (fix): 0.01 > > problem type: symmetric eigenvalue problem > > selected portion of the spectrum: largest eigenvalues in magnitude > > number of eigenvalues (nev): 1 > > number of column vectors (ncv): 17 > > maximum dimension of projected problem (mpd): 17 > > maximum number of iterations: 1700 > > tolerance: 1e-08 > > convergence test: relative to the eigenvalue > > BV Object: 1 MPI processes > > type: svec > > 17 columns of global length 100 > > vector orthogonalization method: classical Gram-Schmidt > > orthogonalization refinement: if needed (eta: 0.7071) > > block orthogonalization method: GS > > doing matmult as a single matrix-matrix product > > DS Object: 1 MPI processes > > type: hep > > solving the problem with: Implicit QR method (_steqr) > > ST Object: 1 MPI processes > > type: precond > > shift: 1.79769e+308 > > number of matrices: 1 > > KSP Object: (st_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=90, initial guess is zero > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (st_) 1 MPI processes > > type: none > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: shell > > rows=100, cols=100 > > Solution method: jd > > > > Number of requested eigenvalues: 1 > > Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; iterations 20 > > ---------------------- -------------------- > > k ||Ax-kx||/||kx|| > > ---------------------- -------------------- > > 7.837972 7.71944e-10 > > ---------------------- -------------------- > > > > > > > > From fdkong.jd at gmail.com Tue Nov 5 11:13:58 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 5 Nov 2019 10:13:58 -0700 Subject: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson In-Reply-To: References: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> Message-ID: How about I want to determine the ST type on runtime? mpirun -n 1 ./ex3 -eps_type jd -st_ksp_type gmres -st_pc_type none -eps_view -eps_target 0 -eps_monitor -st_ksp_monitor ST is indeed STPrecond, but the passed preconditioning matrix is still ignored. EPS Object: 1 MPI processes type: jd search subspace is orthogonalized block size=1 type of the initial subspace: non-Krylov size of the subspace after restarting: 6 number of vectors after restarting from the previous iteration: 1 threshold for changing the target in the correction equation (fix): 0.01 problem type: symmetric eigenvalue problem selected portion of the spectrum: closest to target: 0. 
(in magnitude) number of eigenvalues (nev): 1 number of column vectors (ncv): 17 maximum dimension of projected problem (mpd): 17 maximum number of iterations: 1700 tolerance: 1e-08 convergence test: relative to the eigenvalue BV Object: 1 MPI processes type: svec 17 columns of global length 100 vector orthogonalization method: classical Gram-Schmidt orthogonalization refinement: if needed (eta: 0.7071) block orthogonalization method: GS doing matmult as a single matrix-matrix product DS Object: 1 MPI processes type: hep solving the problem with: Implicit QR method (_steqr) ST Object: 1 MPI processes type: precond shift: 0. number of matrices: 1 KSP Object: (st_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=90, initial guess is zero tolerances: relative=0.0001, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (st_) 1 MPI processes type: none linear system matrix = precond matrix: Mat Object: 1 MPI processes type: shell rows=100, cols=100 Solution method: jd Preconding matrix should be a SeqAIJ not shell. Fande, On Tue, Nov 5, 2019 at 9:07 AM Jose E. Roman wrote: > Currently, the function that passes the preconditioner matrix is specific > of STPRECOND, so you have to add > ierr = STSetType(st,STPRECOND);CHKERRQ(ierr); > before > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > otherwise this latter call is ignored. > > We may be changing a little bit the way in which ST is initialized, and > maybe we modify this as well. It is not decided yet. > > Jose > > > > El 5 nov 2019, a las 0:28, Fande Kong escribi?: > > > > Thanks Jose, > > > > I think I understand now. Another question: what is the right way to > setup a linear preconditioning matrix for the inner linear solver of JD? > > > > I was trying to do something like this: > > > > /* > > Create eigensolver context > > */ > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); > > > > /* > > Set operators. In this case, it is a standard eigenvalue problem > > */ > > ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr); > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > > ierr = EPSGetST(eps,&st);CHKERRQ(ierr); > > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > > > > /* > > Set solver parameters at runtime > > */ > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Solve the eigensystem > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > */ > > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > But did not work. A complete example is attached. I could try to dig > into the code, but you may already know the answer. > > > > > > On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman > wrote: > > Yes, it is confusing. Here is the explanation: when you use a target, > the preconditioner is built from matrix A-sigma*B. By default, instead of > TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat > this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner > is built from matrix B. The thing is that in a standard eigenproblem we > have B=I, and hence there is no point in using a preconditioner, that is > why we set PCNONE. 
> > > > Jose > > > > > > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > > > Hi All, > > > > > > It looks like the preconditioner is hard-coded in the Jacobi-Davidson > solver. I could not select a preconditioner rather than the default setting. > > > > > > For example, I was trying to select LU, but PC NONE was still used. I > ran standard example 2 in slepc/src/eps/examples/tutorials, and had the > following results. > > > > > > > > > Thanks, > > > > > > Fande > > > > > > > > > ./ex2 -eps_type jd -st_ksp_type gmres -st_pc_type lu -eps_view > > > > > > 2-D Laplacian Eigenproblem, N=100 (10x10 grid) > > > > > > EPS Object: 1 MPI processes > > > type: jd > > > search subspace is orthogonalized > > > block size=1 > > > type of the initial subspace: non-Krylov > > > size of the subspace after restarting: 6 > > > number of vectors after restarting from the previous iteration: 1 > > > threshold for changing the target in the correction equation > (fix): 0.01 > > > problem type: symmetric eigenvalue problem > > > selected portion of the spectrum: largest eigenvalues in magnitude > > > number of eigenvalues (nev): 1 > > > number of column vectors (ncv): 17 > > > maximum dimension of projected problem (mpd): 17 > > > maximum number of iterations: 1700 > > > tolerance: 1e-08 > > > convergence test: relative to the eigenvalue > > > BV Object: 1 MPI processes > > > type: svec > > > 17 columns of global length 100 > > > vector orthogonalization method: classical Gram-Schmidt > > > orthogonalization refinement: if needed (eta: 0.7071) > > > block orthogonalization method: GS > > > doing matmult as a single matrix-matrix product > > > DS Object: 1 MPI processes > > > type: hep > > > solving the problem with: Implicit QR method (_steqr) > > > ST Object: 1 MPI processes > > > type: precond > > > shift: 1.79769e+308 > > > number of matrices: 1 > > > KSP Object: (st_) 1 MPI processes > > > type: gmres > > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > happy breakdown tolerance 1e-30 > > > maximum iterations=90, initial guess is zero > > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: (st_) 1 MPI processes > > > type: none > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: shell > > > rows=100, cols=100 > > > Solution method: jd > > > > > > Number of requested eigenvalues: 1 > > > Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; > iterations 20 > > > ---------------------- -------------------- > > > k ||Ax-kx||/||kx|| > > > ---------------------- -------------------- > > > 7.837972 7.71944e-10 > > > ---------------------- -------------------- > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jczhang at mcs.anl.gov Tue Nov 5 11:32:59 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Tue, 5 Nov 2019 17:32:59 +0000 Subject: [petsc-users] VecDuplicate for FFTW-Vec causes VecDestroy to fail conditionally on VecLoad In-Reply-To: References: <9495B365-A1AE-4840-8E2D-45CD01DE3D41@anl.gov> Message-ID: Fixed in https://gitlab.com/petsc/petsc/merge_requests/2262 --Junchao Zhang On Fri, Nov 1, 2019 at 6:51 PM Sajid Ali > wrote: Hi Junchao/Barry, It doesn't really matter what the h5 file contains, so I'm attaching a lightly edited script of src/vec/vec/examples/tutorials/ex10.c which should produce a vector to be used as input for the above test case. (I'm working with ` --with-scalar-type=complex`). Now that I think of it, fixing this bug is not important, I can workaround the issue by creating a new vector with VecCreateMPI and accept the small loss in performance of VecPointwiseMult due to misaligned layouts. If it's a small fix it may be worth the time, but fixing this is not a big priority right now. If it's a complicated fix, this issue can serve as a note to future users. Thank You, Sajid Ali Applied Physics Northwestern University s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Nov 5 11:33:13 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 5 Nov 2019 18:33:13 +0100 Subject: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson In-Reply-To: References: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> Message-ID: <00026233-9A46-4FB0-92C1-DEF642B0682F@dsic.upv.es> JD sets STPRECOND at EPSSetUp(), if it was not set before. So I guess you need to add -st_type precond on the command line, so that it is set at EPSSetFromOptions(). Jose > El 5 nov 2019, a las 18:13, Fande Kong escribi?: > > How about I want to determine the ST type on runtime? > > mpirun -n 1 ./ex3 -eps_type jd -st_ksp_type gmres -st_pc_type none -eps_view -eps_target 0 -eps_monitor -st_ksp_monitor > > ST is indeed STPrecond, but the passed preconditioning matrix is still ignored. > > EPS Object: 1 MPI processes > type: jd > search subspace is orthogonalized > block size=1 > type of the initial subspace: non-Krylov > size of the subspace after restarting: 6 > number of vectors after restarting from the previous iteration: 1 > threshold for changing the target in the correction equation (fix): 0.01 > problem type: symmetric eigenvalue problem > selected portion of the spectrum: closest to target: 0. (in magnitude) > number of eigenvalues (nev): 1 > number of column vectors (ncv): 17 > maximum dimension of projected problem (mpd): 17 > maximum number of iterations: 1700 > tolerance: 1e-08 > convergence test: relative to the eigenvalue > BV Object: 1 MPI processes > type: svec > 17 columns of global length 100 > vector orthogonalization method: classical Gram-Schmidt > orthogonalization refinement: if needed (eta: 0.7071) > block orthogonalization method: GS > doing matmult as a single matrix-matrix product > DS Object: 1 MPI processes > type: hep > solving the problem with: Implicit QR method (_steqr) > ST Object: 1 MPI processes > type: precond > shift: 0. > number of matrices: 1 > KSP Object: (st_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=90, initial guess is zero > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (st_) 1 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: shell > rows=100, cols=100 > Solution method: jd > > > Preconding matrix should be a SeqAIJ not shell. > > > Fande, > > On Tue, Nov 5, 2019 at 9:07 AM Jose E. Roman wrote: > Currently, the function that passes the preconditioner matrix is specific of STPRECOND, so you have to add > ierr = STSetType(st,STPRECOND);CHKERRQ(ierr); > before > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > otherwise this latter call is ignored. > > We may be changing a little bit the way in which ST is initialized, and maybe we modify this as well. It is not decided yet. > > Jose > > > > El 5 nov 2019, a las 0:28, Fande Kong escribi?: > > > > Thanks Jose, > > > > I think I understand now. Another question: what is the right way to setup a linear preconditioning matrix for the inner linear solver of JD? > > > > I was trying to do something like this: > > > > /* > > Create eigensolver context > > */ > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); > > > > /* > > Set operators. In this case, it is a standard eigenvalue problem > > */ > > ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr); > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > > ierr = EPSGetST(eps,&st);CHKERRQ(ierr); > > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > > > > /* > > Set solver parameters at runtime > > */ > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Solve the eigensystem > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > But did not work. A complete example is attached. I could try to dig into the code, but you may already know the answer. > > > > > > On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman wrote: > > Yes, it is confusing. Here is the explanation: when you use a target, the preconditioner is built from matrix A-sigma*B. By default, instead of TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner is built from matrix B. The thing is that in a standard eigenproblem we have B=I, and hence there is no point in using a preconditioner, that is why we set PCNONE. > > > > Jose > > > > > > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users escribi?: > > > > > > Hi All, > > > > > > It looks like the preconditioner is hard-coded in the Jacobi-Davidson solver. I could not select a preconditioner rather than the default setting. > > > > > > For example, I was trying to select LU, but PC NONE was still used. I ran standard example 2 in slepc/src/eps/examples/tutorials, and had the following results. 
> > > > > > > > > Thanks, > > > > > > Fande > > > > > > > > > ./ex2 -eps_type jd -st_ksp_type gmres -st_pc_type lu -eps_view > > > > > > 2-D Laplacian Eigenproblem, N=100 (10x10 grid) > > > > > > EPS Object: 1 MPI processes > > > type: jd > > > search subspace is orthogonalized > > > block size=1 > > > type of the initial subspace: non-Krylov > > > size of the subspace after restarting: 6 > > > number of vectors after restarting from the previous iteration: 1 > > > threshold for changing the target in the correction equation (fix): 0.01 > > > problem type: symmetric eigenvalue problem > > > selected portion of the spectrum: largest eigenvalues in magnitude > > > number of eigenvalues (nev): 1 > > > number of column vectors (ncv): 17 > > > maximum dimension of projected problem (mpd): 17 > > > maximum number of iterations: 1700 > > > tolerance: 1e-08 > > > convergence test: relative to the eigenvalue > > > BV Object: 1 MPI processes > > > type: svec > > > 17 columns of global length 100 > > > vector orthogonalization method: classical Gram-Schmidt > > > orthogonalization refinement: if needed (eta: 0.7071) > > > block orthogonalization method: GS > > > doing matmult as a single matrix-matrix product > > > DS Object: 1 MPI processes > > > type: hep > > > solving the problem with: Implicit QR method (_steqr) > > > ST Object: 1 MPI processes > > > type: precond > > > shift: 1.79769e+308 > > > number of matrices: 1 > > > KSP Object: (st_) 1 MPI processes > > > type: gmres > > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > > happy breakdown tolerance 1e-30 > > > maximum iterations=90, initial guess is zero > > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > > left preconditioning > > > using PRECONDITIONED norm type for convergence test > > > PC Object: (st_) 1 MPI processes > > > type: none > > > linear system matrix = precond matrix: > > > Mat Object: 1 MPI processes > > > type: shell > > > rows=100, cols=100 > > > Solution method: jd > > > > > > Number of requested eigenvalues: 1 > > > Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; iterations 20 > > > ---------------------- -------------------- > > > k ||Ax-kx||/||kx|| > > > ---------------------- -------------------- > > > 7.837972 7.71944e-10 > > > ---------------------- -------------------- > > > > > > > > > > > > > > From fdkong.jd at gmail.com Tue Nov 5 11:55:50 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 5 Nov 2019 10:55:50 -0700 Subject: [petsc-users] Select a preconditioner for SLEPc eigenvalue solver Jacobi-Davidson In-Reply-To: <00026233-9A46-4FB0-92C1-DEF642B0682F@dsic.upv.es> References: <1CC5E48C-7709-44E0-84F9-7FBD46297069@dsic.upv.es> <00026233-9A46-4FB0-92C1-DEF642B0682F@dsic.upv.es> Message-ID: OK, I figured it out! I need to add the code : ierr = EPSGetST(eps,&st);CHKERRQ(ierr); ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); after ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); It might be a good idea to document this if we intend to do so. Fande, On Tue, Nov 5, 2019 at 10:33 AM Jose E. Roman wrote: > JD sets STPRECOND at EPSSetUp(), if it was not set before. So I guess you > need to add -st_type precond on the command line, so that it is set at > EPSSetFromOptions(). > > Jose > > > > El 5 nov 2019, a las 18:13, Fande Kong escribi?: > > > > How about I want to determine the ST type on runtime? 
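To collect the resolution of this sub-thread in one place, here is a sketch of a call sequence that hands a separate preconditioning matrix B to the Jacobi-Davidson inner KSP. A and B are assumed to be already-assembled Mats; run with -eps_type jd and, if STSetType(st,STPRECOND) is not called explicitly, also -st_type precond so the ST is already STPRECOND when the matrix is attached:

  EPS eps;
  ST  st;

  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);   /* standard eigenproblem */
  ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);        /* -eps_type jd -st_type precond ... */
  ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
  ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr);    /* after EPSSetFromOptions(), so it is not ignored */
  ierr = EPSSolve(eps);CHKERRQ(ierr);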
> > > > mpirun -n 1 ./ex3 -eps_type jd -st_ksp_type gmres -st_pc_type none > -eps_view -eps_target 0 -eps_monitor -st_ksp_monitor > > > > ST is indeed STPrecond, but the passed preconditioning matrix is still > ignored. > > > > EPS Object: 1 MPI processes > > type: jd > > search subspace is orthogonalized > > block size=1 > > type of the initial subspace: non-Krylov > > size of the subspace after restarting: 6 > > number of vectors after restarting from the previous iteration: 1 > > threshold for changing the target in the correction equation (fix): > 0.01 > > problem type: symmetric eigenvalue problem > > selected portion of the spectrum: closest to target: 0. (in magnitude) > > number of eigenvalues (nev): 1 > > number of column vectors (ncv): 17 > > maximum dimension of projected problem (mpd): 17 > > maximum number of iterations: 1700 > > tolerance: 1e-08 > > convergence test: relative to the eigenvalue > > BV Object: 1 MPI processes > > type: svec > > 17 columns of global length 100 > > vector orthogonalization method: classical Gram-Schmidt > > orthogonalization refinement: if needed (eta: 0.7071) > > block orthogonalization method: GS > > doing matmult as a single matrix-matrix product > > DS Object: 1 MPI processes > > type: hep > > solving the problem with: Implicit QR method (_steqr) > > ST Object: 1 MPI processes > > type: precond > > shift: 0. > > number of matrices: 1 > > KSP Object: (st_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=90, initial guess is zero > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: (st_) 1 MPI processes > > type: none > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: shell > > rows=100, cols=100 > > Solution method: jd > > > > > > Preconding matrix should be a SeqAIJ not shell. > > > > > > Fande, > > > > On Tue, Nov 5, 2019 at 9:07 AM Jose E. Roman wrote: > > Currently, the function that passes the preconditioner matrix is > specific of STPRECOND, so you have to add > > ierr = STSetType(st,STPRECOND);CHKERRQ(ierr); > > before > > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > > otherwise this latter call is ignored. > > > > We may be changing a little bit the way in which ST is initialized, and > maybe we modify this as well. It is not decided yet. > > > > Jose > > > > > > > El 5 nov 2019, a las 0:28, Fande Kong escribi?: > > > > > > Thanks Jose, > > > > > > I think I understand now. Another question: what is the right way to > setup a linear preconditioning matrix for the inner linear solver of JD? > > > > > > I was trying to do something like this: > > > > > > /* > > > Create eigensolver context > > > */ > > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr); > > > > > > /* > > > Set operators. 
In this case, it is a standard eigenvalue problem > > > */ > > > ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr); > > > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRQ(ierr); > > > ierr = EPSGetST(eps,&st);CHKERRQ(ierr); > > > ierr = STPrecondSetMatForPC(st,B);CHKERRQ(ierr); > > > > > > /* > > > Set solver parameters at runtime > > > */ > > > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > - > > > Solve the eigensystem > > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > - */ > > > > > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > > > > > But did not work. A complete example is attached. I could try to dig > into the code, but you may already know the answer. > > > > > > > > > On Wed, Oct 23, 2019 at 3:58 AM Jose E. Roman > wrote: > > > Yes, it is confusing. Here is the explanation: when you use a target, > the preconditioner is built from matrix A-sigma*B. By default, instead of > TARGET_MAGNITUDE we set LARGEST_MAGNITUDE, and in Jacobi-Davidson we treat > this case by setting sigma=PETSC_MAX_REAL. In this case, the preconditioner > is built from matrix B. The thing is that in a standard eigenproblem we > have B=I, and hence there is no point in using a preconditioner, that is > why we set PCNONE. > > > > > > Jose > > > > > > > > > > El 22 oct 2019, a las 19:57, Fande Kong via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > > > > > > > Hi All, > > > > > > > > It looks like the preconditioner is hard-coded in the > Jacobi-Davidson solver. I could not select a preconditioner rather than the > default setting. > > > > > > > > For example, I was trying to select LU, but PC NONE was still used. > I ran standard example 2 in slepc/src/eps/examples/tutorials, and had the > following results. 
> > > > > > > > > > > > Thanks, > > > > > > > > Fande > > > > > > > > > > > > ./ex2 -eps_type jd -st_ksp_type gmres -st_pc_type lu -eps_view > > > > > > > > 2-D Laplacian Eigenproblem, N=100 (10x10 grid) > > > > > > > > EPS Object: 1 MPI processes > > > > type: jd > > > > search subspace is orthogonalized > > > > block size=1 > > > > type of the initial subspace: non-Krylov > > > > size of the subspace after restarting: 6 > > > > number of vectors after restarting from the previous iteration: 1 > > > > threshold for changing the target in the correction equation > (fix): 0.01 > > > > problem type: symmetric eigenvalue problem > > > > selected portion of the spectrum: largest eigenvalues in magnitude > > > > number of eigenvalues (nev): 1 > > > > number of column vectors (ncv): 17 > > > > maximum dimension of projected problem (mpd): 17 > > > > maximum number of iterations: 1700 > > > > tolerance: 1e-08 > > > > convergence test: relative to the eigenvalue > > > > BV Object: 1 MPI processes > > > > type: svec > > > > 17 columns of global length 100 > > > > vector orthogonalization method: classical Gram-Schmidt > > > > orthogonalization refinement: if needed (eta: 0.7071) > > > > block orthogonalization method: GS > > > > doing matmult as a single matrix-matrix product > > > > DS Object: 1 MPI processes > > > > type: hep > > > > solving the problem with: Implicit QR method (_steqr) > > > > ST Object: 1 MPI processes > > > > type: precond > > > > shift: 1.79769e+308 > > > > number of matrices: 1 > > > > KSP Object: (st_) 1 MPI processes > > > > type: gmres > > > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > > > happy breakdown tolerance 1e-30 > > > > maximum iterations=90, initial guess is zero > > > > tolerances: relative=0.0001, absolute=1e-50, divergence=10000. > > > > left preconditioning > > > > using PRECONDITIONED norm type for convergence test > > > > PC Object: (st_) 1 MPI processes > > > > type: none > > > > linear system matrix = precond matrix: > > > > Mat Object: 1 MPI processes > > > > type: shell > > > > rows=100, cols=100 > > > > Solution method: jd > > > > > > > > Number of requested eigenvalues: 1 > > > > Linear eigensolve converged (1 eigenpair) due to CONVERGED_TOL; > iterations 20 > > > > ---------------------- -------------------- > > > > k ||Ax-kx||/||kx|| > > > > ---------------------- -------------------- > > > > 7.837972 7.71944e-10 > > > > ---------------------- -------------------- > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Tue Nov 5 15:09:28 2019 From: hgbk2008 at gmail.com (hg) Date: Tue, 5 Nov 2019 22:09:28 +0100 Subject: [petsc-users] solve problem with pastix Message-ID: Hello I got crashed when using Pastix as solver for KSP. The error message looks like: .... 
NUMBER of BUBBLE 1 COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 ** End of Partition & Distribution phase ** Time to analyze 0.225 s Number of nonzeros in factorized matrix 708784076 Fill-in 12.2337 Number of operations (LU) 2.80185e+12 Prediction Time to factorize (AMD 6180 MKL) 394 s 0 : SolverMatrix size (without coefficients) 32.4 MB 0 : Number of nonzeros (local block structure) 365309391 Numerical Factorization (LU) : 0 : Internal CSC size 1.08 GB Time to fill internal csc 6.66 s --- Sopalin : Allocation de la structure globale --- --- Fin Sopalin Init --- --- Initialisation des tableaux globaux --- sched_setaffinity: Invalid argument [node083:165071] *** Process received signal *** [node083:165071] Signal: Aborted (6) [node083:165071] Signal code: (-6) [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] [node083:165071] [ 3] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 communication, 0 out-of-core) --- Sopalin : Local structure allocation --- /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] [node083:165071] [ 5] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] [node083:165071] [ 6] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] [node083:165071] [ 7] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] [node083:165071] [ 8] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] [node083:165071] [ 9] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] [node083:165071] [10] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] [node083:165071] [11] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] [node083:165071] [12] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] [node083:165071] [13] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] [node083:165071] [14] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] [node083:165071] [15] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] Does anyone have an idea what is the problem and how to fix it? The PETSc parameters I used are as below: -pc_type lu -pc_factor_mat_solver_package pastix -mat_pastix_verbose 2 -mat_pastix_threadnbr 1 Giang -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 5 15:49:52 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Nov 2019 16:49:52 -0500 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: Message-ID: On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users wrote: > Hello > > I got crashed when using Pastix as solver for KSP. The error message looks > like: > > .... 
> NUMBER of BUBBLE 1 > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > ** End of Partition & Distribution phase ** > Time to analyze 0.225 s > Number of nonzeros in factorized matrix 708784076 > Fill-in 12.2337 > Number of operations (LU) 2.80185e+12 > Prediction Time to factorize (AMD 6180 MKL) 394 s > 0 : SolverMatrix size (without coefficients) 32.4 MB > 0 : Number of nonzeros (local block structure) 365309391 > Numerical Factorization (LU) : > 0 : Internal CSC size 1.08 GB > Time to fill internal csc 6.66 s > --- Sopalin : Allocation de la structure globale --- > --- Fin Sopalin Init --- > --- Initialisation des tableaux globaux --- > sched_setaffinity: Invalid argument > [node083:165071] *** Process received signal *** > [node083:165071] Signal: Aborted (6) > [node083:165071] Signal code: (-6) > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > [node083:165071] [ 3] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 > communication, 0 out-of-core) > --- Sopalin : Local structure allocation --- > > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > [node083:165071] [ 5] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > [node083:165071] [ 6] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > [node083:165071] [ 7] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > [node083:165071] [ 8] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > [node083:165071] [ 9] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > [node083:165071] [10] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > [node083:165071] [11] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > [node083:165071] [12] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > [node083:165071] [13] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > [node083:165071] [14] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > [node083:165071] [15] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > Does anyone have an idea what is the problem and how to fix it? The PETSc > parameters I used are as below: > It looks like PasTix is having trouble setting the thread affinity: sched_setaffinity: Invalid argument so it may be your build of PasTix. Thanks, Matt > -pc_type lu > -pc_factor_mat_solver_package pastix > -mat_pastix_verbose 2 > -mat_pastix_threadnbr 1 > > Giang > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
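One hedged suggestion, not confirmed anywhere in this thread: sched_setaffinity failing with EINVAL is sometimes a conflict between the MPI launcher's own core binding and the threads PaStiX tries to pin, so disabling the launcher's binding is a cheap experiment. With Open MPI that would be something like

  mpirun --bind-to none -n 4 ./your_app -pc_type lu -pc_factor_mat_solver_package pastix -mat_pastix_threadnbr 1

where ./your_app and the process count are placeholders; --bind-to none is a standard Open MPI option, and the PETSc options are the ones already used above.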
URL: From hgbk2008 at gmail.com Tue Nov 5 16:32:08 2019 From: hgbk2008 at gmail.com (hg) Date: Tue, 5 Nov 2019 23:32:08 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: Message-ID: Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and also OMP_NUM_THREADS to 1 Giang On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley wrote: > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users > wrote: > >> Hello >> >> I got crashed when using Pastix as solver for KSP. The error message >> looks like: >> >> .... >> NUMBER of BUBBLE 1 >> COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >> ** End of Partition & Distribution phase ** >> Time to analyze 0.225 s >> Number of nonzeros in factorized matrix 708784076 >> Fill-in 12.2337 >> Number of operations (LU) 2.80185e+12 >> Prediction Time to factorize (AMD 6180 MKL) 394 s >> 0 : SolverMatrix size (without coefficients) 32.4 MB >> 0 : Number of nonzeros (local block structure) 365309391 >> Numerical Factorization (LU) : >> 0 : Internal CSC size 1.08 GB >> Time to fill internal csc 6.66 s >> --- Sopalin : Allocation de la structure globale --- >> --- Fin Sopalin Init --- >> --- Initialisation des tableaux globaux --- >> sched_setaffinity: Invalid argument >> [node083:165071] *** Process received signal *** >> [node083:165071] Signal: Aborted (6) >> [node083:165071] Signal code: (-6) >> [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >> [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >> [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >> [node083:165071] [ 3] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >> [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >> communication, 0 out-of-core) >> --- Sopalin : Local structure allocation --- >> >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >> [node083:165071] [ 5] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >> [node083:165071] [ 6] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >> [node083:165071] [ 7] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >> [node083:165071] [ 8] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >> [node083:165071] [ 9] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >> [node083:165071] [10] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >> [node083:165071] [11] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >> [node083:165071] [12] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >> [node083:165071] [13] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >> [node083:165071] [14] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >> [node083:165071] [15] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >> >> Does anyone have an idea what is the problem and how to fix it? 
The PETSc >> parameters I used are as below: >> > > It looks like PasTix is having trouble setting the thread affinity: > > sched_setaffinity: Invalid argument > > so it may be your build of PasTix. > > Thanks, > > Matt > > >> -pc_type lu >> -pc_factor_mat_solver_package pastix >> -mat_pastix_verbose 2 >> -mat_pastix_threadnbr 1 >> >> Giang >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 5 19:01:09 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Nov 2019 20:01:09 -0500 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: Message-ID: I have no idea. That is a good question for the PasTix list. Thanks, Matt On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and also > OMP_NUM_THREADS to 1 > > Giang > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley wrote: > >> On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello >>> >>> I got crashed when using Pastix as solver for KSP. The error message >>> looks like: >>> >>> .... >>> NUMBER of BUBBLE 1 >>> COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >>> ** End of Partition & Distribution phase ** >>> Time to analyze 0.225 s >>> Number of nonzeros in factorized matrix 708784076 >>> Fill-in 12.2337 >>> Number of operations (LU) 2.80185e+12 >>> Prediction Time to factorize (AMD 6180 MKL) 394 s >>> 0 : SolverMatrix size (without coefficients) 32.4 MB >>> 0 : Number of nonzeros (local block structure) 365309391 >>> Numerical Factorization (LU) : >>> 0 : Internal CSC size 1.08 GB >>> Time to fill internal csc 6.66 s >>> --- Sopalin : Allocation de la structure globale --- >>> --- Fin Sopalin Init --- >>> --- Initialisation des tableaux globaux --- >>> sched_setaffinity: Invalid argument >>> [node083:165071] *** Process received signal *** >>> [node083:165071] Signal: Aborted (6) >>> [node083:165071] Signal code: (-6) >>> [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >>> [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >>> [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >>> [node083:165071] [ 3] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >>> [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >>> communication, 0 out-of-core) >>> --- Sopalin : Local structure allocation --- >>> >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >>> [node083:165071] [ 5] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >>> [node083:165071] [ 6] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >>> [node083:165071] [ 7] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >>> [node083:165071] [ 8] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >>> [node083:165071] [ 9] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >>> [node083:165071] [10] >>> 
/sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >>> [node083:165071] [11] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >>> [node083:165071] [12] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >>> [node083:165071] [13] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >>> [node083:165071] [14] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >>> [node083:165071] [15] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >>> >>> Does anyone have an idea what is the problem and how to fix it? The >>> PETSc parameters I used are as below: >>> >> >> It looks like PasTix is having trouble setting the thread affinity: >> >> sched_setaffinity: Invalid argument >> >> so it may be your build of PasTix. >> >> Thanks, >> >> Matt >> >> >>> -pc_type lu >>> -pc_factor_mat_solver_package pastix >>> -mat_pastix_verbose 2 >>> -mat_pastix_threadnbr 1 >>> >>> Giang >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 5 21:36:59 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 6 Nov 2019 03:36:59 +0000 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: Message-ID: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> Google finds this https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users wrote: > > I have no idea. That is a good question for the PasTix list. > > Thanks, > > Matt > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and also OMP_NUM_THREADS to 1 > > Giang > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley wrote: > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users wrote: > Hello > > I got crashed when using Pastix as solver for KSP. The error message looks like: > > .... 
> NUMBER of BUBBLE 1 > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > ** End of Partition & Distribution phase ** > Time to analyze 0.225 s > Number of nonzeros in factorized matrix 708784076 > Fill-in 12.2337 > Number of operations (LU) 2.80185e+12 > Prediction Time to factorize (AMD 6180 MKL) 394 s > 0 : SolverMatrix size (without coefficients) 32.4 MB > 0 : Number of nonzeros (local block structure) 365309391 > Numerical Factorization (LU) : > 0 : Internal CSC size 1.08 GB > Time to fill internal csc 6.66 s > --- Sopalin : Allocation de la structure globale --- > --- Fin Sopalin Init --- > --- Initialisation des tableaux globaux --- > sched_setaffinity: Invalid argument > [node083:165071] *** Process received signal *** > [node083:165071] Signal: Aborted (6) > [node083:165071] Signal code: (-6) > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > [node083:165071] [ 3] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 communication, 0 out-of-core) > --- Sopalin : Local structure allocation --- > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > [node083:165071] [ 5] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > [node083:165071] [ 6] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > [node083:165071] [ 7] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > [node083:165071] [ 8] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > [node083:165071] [ 9] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > [node083:165071] [10] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > [node083:165071] [11] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > [node083:165071] [12] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > [node083:165071] [13] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > [node083:165071] [14] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > [node083:165071] [15] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > Does anyone have an idea what is the problem and how to fix it? The PETSc parameters I used are as below: > > It looks like PasTix is having trouble setting the thread affinity: > > sched_setaffinity: Invalid argument > > so it may be your build of PasTix. > > Thanks, > > Matt > > -pc_type lu > -pc_factor_mat_solver_package pastix > -mat_pastix_verbose 2 > -mat_pastix_threadnbr 1 > > Giang > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From hgbk2008 at gmail.com Wed Nov 6 03:12:53 2019 From: hgbk2008 at gmail.com (hg) Date: Wed, 6 Nov 2019 10:12:53 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> Message-ID: sched_setaffinity: Invalid argument only happens when I launch the job with sbatch. Running without scheduler is fine. I think this has something to do with pastix. Giang On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. wrote: > > Google finds this > https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > > > > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > I have no idea. That is a good question for the PasTix list. > > > > Thanks, > > > > Matt > > > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and > also OMP_NUM_THREADS to 1 > > > > Giang > > > > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley > wrote: > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hello > > > > I got crashed when using Pastix as solver for KSP. The error message > looks like: > > > > .... > > NUMBER of BUBBLE 1 > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > > ** End of Partition & Distribution phase ** > > Time to analyze 0.225 s > > Number of nonzeros in factorized matrix 708784076 > > Fill-in 12.2337 > > Number of operations (LU) 2.80185e+12 > > Prediction Time to factorize (AMD 6180 MKL) 394 s > > 0 : SolverMatrix size (without coefficients) 32.4 MB > > 0 : Number of nonzeros (local block structure) 365309391 > > Numerical Factorization (LU) : > > 0 : Internal CSC size 1.08 GB > > Time to fill internal csc 6.66 s > > --- Sopalin : Allocation de la structure globale --- > > --- Fin Sopalin Init --- > > --- Initialisation des tableaux globaux --- > > sched_setaffinity: Invalid argument > > [node083:165071] *** Process received signal *** > > [node083:165071] Signal: Aborted (6) > > [node083:165071] Signal code: (-6) > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > > [node083:165071] [ 3] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 > communication, 0 out-of-core) > > --- Sopalin : Local structure allocation --- > > > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > > [node083:165071] [ 5] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > > [node083:165071] [ 6] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > > [node083:165071] [ 7] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > > [node083:165071] [ 8] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > > [node083:165071] [ 9] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > > [node083:165071] [10] > 
/sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > > [node083:165071] [11] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > > [node083:165071] [12] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > > [node083:165071] [13] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > > [node083:165071] [14] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > > [node083:165071] [15] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > > > Does anyone have an idea what is the problem and how to fix it? The > PETSc parameters I used are as below: > > > > It looks like PasTix is having trouble setting the thread affinity: > > > > sched_setaffinity: Invalid argument > > > > so it may be your build of PasTix. > > > > Thanks, > > > > Matt > > > > -pc_type lu > > -pc_factor_mat_solver_package pastix > > -mat_pastix_verbose 2 > > -mat_pastix_threadnbr 1 > > > > Giang > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Wed Nov 6 03:40:06 2019 From: hgbk2008 at gmail.com (hg) Date: Wed, 6 Nov 2019 10:40:06 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> Message-ID: Look into arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c I saw something like: #ifdef HAVE_OLD_SCHED_SETAFFINITY if(sched_setaffinity(0,&mask) < 0) #else /* HAVE_OLD_SCHED_SETAFFINITY */ if(sched_setaffinity(0,sizeof(mask),&mask) < 0) #endif /* HAVE_OLD_SCHED_SETAFFINITY */ { perror("sched_setaffinity"); EXIT(MOD_SOPALIN, INTERNAL_ERR); } Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY during compilation? May I know how to trigger re-compilation of external packages with petsc? I may go in there and check what's going on. Giang On Wed, Nov 6, 2019 at 10:12 AM hg wrote: > sched_setaffinity: Invalid argument only happens when I launch the job > with sbatch. Running without scheduler is fine. I think this has something > to do with pastix. > > Giang > > > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. wrote: > >> >> Google finds this >> https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 >> >> >> >> > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > I have no idea. That is a good question for the PasTix list. >> > >> > Thanks, >> > >> > Matt >> > >> > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: >> > Should thread affinity be invoked? 
I set -mat_pastix_threadnbr 1 and >> also OMP_NUM_THREADS to 1 >> > >> > Giang >> > >> > >> > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley >> wrote: >> > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > Hello >> > >> > I got crashed when using Pastix as solver for KSP. The error message >> looks like: >> > >> > .... >> > NUMBER of BUBBLE 1 >> > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >> > ** End of Partition & Distribution phase ** >> > Time to analyze 0.225 s >> > Number of nonzeros in factorized matrix 708784076 >> > Fill-in 12.2337 >> > Number of operations (LU) 2.80185e+12 >> > Prediction Time to factorize (AMD 6180 MKL) 394 s >> > 0 : SolverMatrix size (without coefficients) 32.4 MB >> > 0 : Number of nonzeros (local block structure) 365309391 >> > Numerical Factorization (LU) : >> > 0 : Internal CSC size 1.08 GB >> > Time to fill internal csc 6.66 s >> > --- Sopalin : Allocation de la structure globale --- >> > --- Fin Sopalin Init --- >> > --- Initialisation des tableaux globaux --- >> > sched_setaffinity: Invalid argument >> > [node083:165071] *** Process received signal *** >> > [node083:165071] Signal: Aborted (6) >> > [node083:165071] Signal code: (-6) >> > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >> > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >> > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >> > [node083:165071] [ 3] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >> > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >> communication, 0 out-of-core) >> > --- Sopalin : Local structure allocation --- >> > >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >> > [node083:165071] [ 5] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >> > [node083:165071] [ 6] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >> > [node083:165071] [ 7] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >> > [node083:165071] [ 8] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >> > [node083:165071] [ 9] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >> > [node083:165071] [10] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >> > [node083:165071] [11] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >> > [node083:165071] [12] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >> > [node083:165071] [13] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >> > [node083:165071] [14] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >> > [node083:165071] [15] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >> > >> > Does anyone have an idea what is the problem and how to fix it? The >> PETSc parameters I used are as below: >> > >> > It looks like PasTix is having trouble setting the thread affinity: >> > >> > sched_setaffinity: Invalid argument >> > >> > so it may be your build of PasTix. 
>> > >> > Thanks, >> > >> > Matt >> > >> > -pc_type lu >> > -pc_factor_mat_solver_package pastix >> > -mat_pastix_verbose 2 >> > -mat_pastix_threadnbr 1 >> > >> > Giang >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 6 04:02:58 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Nov 2019 05:02:58 -0500 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> Message-ID: On Wed, Nov 6, 2019 at 4:40 AM hg wrote: > Look into > arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c > I saw something like: > > #ifdef HAVE_OLD_SCHED_SETAFFINITY > if(sched_setaffinity(0,&mask) < 0) > #else /* HAVE_OLD_SCHED_SETAFFINITY */ > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ > { > perror("sched_setaffinity"); > EXIT(MOD_SOPALIN, INTERNAL_ERR); > } > > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY during > compilation? > > May I know how to trigger re-compilation of external packages with petsc? > I may go in there and check what's going on. > If we built it during configure, then you can just go to $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ and rebuild/install it to test. If you want configure to do it, you have to delete $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix and reconfigure. Thanks, Matt > Giang > > > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: > >> sched_setaffinity: Invalid argument only happens when I launch the job >> with sbatch. Running without scheduler is fine. I think this has something >> to do with pastix. >> >> Giang >> >> >> On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. >> wrote: >> >>> >>> Google finds this >>> https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 >>> >>> >>> >>> > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > >>> > I have no idea. That is a good question for the PasTix list. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: >>> > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and >>> also OMP_NUM_THREADS to 1 >>> > >>> > Giang >>> > >>> > >>> > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley >>> wrote: >>> > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > Hello >>> > >>> > I got crashed when using Pastix as solver for KSP. The error message >>> looks like: >>> > >>> > .... 
>>> > NUMBER of BUBBLE 1 >>> > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >>> > ** End of Partition & Distribution phase ** >>> > Time to analyze 0.225 s >>> > Number of nonzeros in factorized matrix 708784076 >>> > Fill-in 12.2337 >>> > Number of operations (LU) 2.80185e+12 >>> > Prediction Time to factorize (AMD 6180 MKL) 394 s >>> > 0 : SolverMatrix size (without coefficients) 32.4 MB >>> > 0 : Number of nonzeros (local block structure) 365309391 >>> > Numerical Factorization (LU) : >>> > 0 : Internal CSC size 1.08 GB >>> > Time to fill internal csc 6.66 s >>> > --- Sopalin : Allocation de la structure globale --- >>> > --- Fin Sopalin Init --- >>> > --- Initialisation des tableaux globaux --- >>> > sched_setaffinity: Invalid argument >>> > [node083:165071] *** Process received signal *** >>> > [node083:165071] Signal: Aborted (6) >>> > [node083:165071] Signal code: (-6) >>> > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >>> > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >>> > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >>> > [node083:165071] [ 3] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >>> > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >>> communication, 0 out-of-core) >>> > --- Sopalin : Local structure allocation --- >>> > >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >>> > [node083:165071] [ 5] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >>> > [node083:165071] [ 6] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >>> > [node083:165071] [ 7] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >>> > [node083:165071] [ 8] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >>> > [node083:165071] [ 9] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >>> > [node083:165071] [10] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >>> > [node083:165071] [11] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >>> > [node083:165071] [12] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >>> > [node083:165071] [13] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >>> > [node083:165071] [14] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >>> > [node083:165071] [15] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >>> > >>> > Does anyone have an idea what is the problem and how to fix it? The >>> PETSc parameters I used are as below: >>> > >>> > It looks like PasTix is having trouble setting the thread affinity: >>> > >>> > sched_setaffinity: Invalid argument >>> > >>> > so it may be your build of PasTix. 
>>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > -pc_type lu >>> > -pc_factor_mat_solver_package pastix >>> > -mat_pastix_verbose 2 >>> > -mat_pastix_threadnbr 1 >>> > >>> > Giang >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > https://www.cse.buffalo.edu/~knepley/ >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > https://www.cse.buffalo.edu/~knepley/ >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Nov 6 09:52:20 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 6 Nov 2019 15:52:20 +0000 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> Message-ID: <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> You can also just look at configure.log where it will show the calling sequence of how PETSc configured and built Pastix. The recipe is in config/BuildSystem/config/packages/PaStiX.py we don't monkey with low level things like the affinity of external packages. My guess is that your cluster system has inconsistent parts related to this, that one tool works and another does not indicates they are inconsistent with respect to each other in what they expect. Barry > On Nov 6, 2019, at 4:02 AM, Matthew Knepley wrote: > > On Wed, Nov 6, 2019 at 4:40 AM hg wrote: > Look into arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c I saw something like: > > #ifdef HAVE_OLD_SCHED_SETAFFINITY > if(sched_setaffinity(0,&mask) < 0) > #else /* HAVE_OLD_SCHED_SETAFFINITY */ > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ > { > perror("sched_setaffinity"); > EXIT(MOD_SOPALIN, INTERNAL_ERR); > } > > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY during compilation? > > May I know how to trigger re-compilation of external packages with petsc? I may go in there and check what's going on. > > If we built it during configure, then you can just go to > > $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ > > and rebuild/install it to test. If you want configure to do it, you have to delete > > $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix > > and reconfigure. > > Thanks, > > Matt > > Giang > > > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: > sched_setaffinity: Invalid argument only happens when I launch the job with sbatch. Running without scheduler is fine. I think this has something to do with pastix. > > Giang > > > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. wrote: > > Google finds this https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > > > > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users wrote: > > > > I have no idea. That is a good question for the PasTix list. > > > > Thanks, > > > > Matt > > > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > > Should thread affinity be invoked? 
I set -mat_pastix_threadnbr 1 and also OMP_NUM_THREADS to 1 > > > > Giang > > > > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley wrote: > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users wrote: > > Hello > > > > I got crashed when using Pastix as solver for KSP. The error message looks like: > > > > .... > > NUMBER of BUBBLE 1 > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > > ** End of Partition & Distribution phase ** > > Time to analyze 0.225 s > > Number of nonzeros in factorized matrix 708784076 > > Fill-in 12.2337 > > Number of operations (LU) 2.80185e+12 > > Prediction Time to factorize (AMD 6180 MKL) 394 s > > 0 : SolverMatrix size (without coefficients) 32.4 MB > > 0 : Number of nonzeros (local block structure) 365309391 > > Numerical Factorization (LU) : > > 0 : Internal CSC size 1.08 GB > > Time to fill internal csc 6.66 s > > --- Sopalin : Allocation de la structure globale --- > > --- Fin Sopalin Init --- > > --- Initialisation des tableaux globaux --- > > sched_setaffinity: Invalid argument > > [node083:165071] *** Process received signal *** > > [node083:165071] Signal: Aborted (6) > > [node083:165071] Signal code: (-6) > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > > [node083:165071] [ 3] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 communication, 0 out-of-core) > > --- Sopalin : Local structure allocation --- > > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > > [node083:165071] [ 5] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > > [node083:165071] [ 6] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > > [node083:165071] [ 7] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > > [node083:165071] [ 8] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > > [node083:165071] [ 9] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > > [node083:165071] [10] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > > [node083:165071] [11] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > > [node083:165071] [12] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > > [node083:165071] [13] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > > [node083:165071] [14] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > > [node083:165071] [15] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > > > Does anyone have an idea what is the problem and how to fix it? The PETSc parameters I used are as below: > > > > It looks like PasTix is having trouble setting the thread affinity: > > > > sched_setaffinity: Invalid argument > > > > so it may be your build of PasTix. 
> > > > Thanks, > > > > Matt > > > > -pc_type lu > > -pc_factor_mat_solver_package pastix > > -mat_pastix_verbose 2 > > -mat_pastix_threadnbr 1 > > > > Giang > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From hgbk2008 at gmail.com Wed Nov 6 17:18:18 2019 From: hgbk2008 at gmail.com (hg) Date: Thu, 7 Nov 2019 00:18:18 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> Message-ID: Hi Barry Maybe you're right, sched_setaffinity returns EINVAL in my case, Probably the scheduler does not allow the process to bind to thread on its own. Giang On Wed, Nov 6, 2019 at 4:52 PM Smith, Barry F. wrote: > > You can also just look at configure.log where it will show the calling > sequence of how PETSc configured and built Pastix. The recipe is in > config/BuildSystem/config/packages/PaStiX.py we don't monkey with low level > things like the affinity of external packages. My guess is that your > cluster system has inconsistent parts related to this, that one tool works > and another does not indicates they are inconsistent with respect to each > other in what they expect. > > Barry > > > > > > On Nov 6, 2019, at 4:02 AM, Matthew Knepley wrote: > > > > On Wed, Nov 6, 2019 at 4:40 AM hg wrote: > > Look into > arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c > I saw something like: > > > > #ifdef HAVE_OLD_SCHED_SETAFFINITY > > if(sched_setaffinity(0,&mask) < 0) > > #else /* HAVE_OLD_SCHED_SETAFFINITY */ > > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) > > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ > > { > > perror("sched_setaffinity"); > > EXIT(MOD_SOPALIN, INTERNAL_ERR); > > } > > > > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY > during compilation? > > > > May I know how to trigger re-compilation of external packages with > petsc? I may go in there and check what's going on. > > > > If we built it during configure, then you can just go to > > > > $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ > > > > and rebuild/install it to test. If you want configure to do it, you have > to delete > > > > $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix > > > > and reconfigure. > > > > Thanks, > > > > Matt > > > > Giang > > > > > > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: > > sched_setaffinity: Invalid argument only happens when I launch the job > with sbatch. Running without scheduler is fine. I think this has something > to do with pastix. > > > > Giang > > > > > > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. 
> wrote: > > > > Google finds this > https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > > > > > > > > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > I have no idea. That is a good question for the PasTix list. > > > > > > Thanks, > > > > > > Matt > > > > > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > > > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and > also OMP_NUM_THREADS to 1 > > > > > > Giang > > > > > > > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley > wrote: > > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > Hello > > > > > > I got crashed when using Pastix as solver for KSP. The error message > looks like: > > > > > > .... > > > NUMBER of BUBBLE 1 > > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > > > ** End of Partition & Distribution phase ** > > > Time to analyze 0.225 s > > > Number of nonzeros in factorized matrix 708784076 > > > Fill-in 12.2337 > > > Number of operations (LU) 2.80185e+12 > > > Prediction Time to factorize (AMD 6180 MKL) 394 s > > > 0 : SolverMatrix size (without coefficients) 32.4 MB > > > 0 : Number of nonzeros (local block structure) 365309391 > > > Numerical Factorization (LU) : > > > 0 : Internal CSC size 1.08 GB > > > Time to fill internal csc 6.66 s > > > --- Sopalin : Allocation de la structure globale --- > > > --- Fin Sopalin Init --- > > > --- Initialisation des tableaux globaux --- > > > sched_setaffinity: Invalid argument > > > [node083:165071] *** Process received signal *** > > > [node083:165071] Signal: Aborted (6) > > > [node083:165071] Signal code: (-6) > > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > > > [node083:165071] [ 3] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 > communication, 0 out-of-core) > > > --- Sopalin : Local structure allocation --- > > > > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > > > [node083:165071] [ 5] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > > > [node083:165071] [ 6] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > > > [node083:165071] [ 7] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > > > [node083:165071] [ 8] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > > > [node083:165071] [ 9] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > > > [node083:165071] [10] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > > > [node083:165071] [11] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > > > [node083:165071] [12] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > > > [node083:165071] [13] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > > > [node083:165071] [14] > 
/sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > > > [node083:165071] [15] > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > > > > > Does anyone have an idea what is the problem and how to fix it? The > PETSc parameters I used are as below: > > > > > > It looks like PasTix is having trouble setting the thread affinity: > > > > > > sched_setaffinity: Invalid argument > > > > > > so it may be your build of PasTix. > > > > > > Thanks, > > > > > > Matt > > > > > > -pc_type lu > > > -pc_factor_mat_solver_package pastix > > > -mat_pastix_verbose 2 > > > -mat_pastix_threadnbr 1 > > > > > > Giang > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s.a.hack at utwente.nl Thu Nov 7 05:44:18 2019 From: s.a.hack at utwente.nl (s.a.hack at utwente.nl) Date: Thu, 7 Nov 2019 11:44:18 +0000 Subject: [petsc-users] nondeterministic behavior of MUMPS when filtering out zero rows and columns Message-ID: Hi, I am doing calculations with version 3.12.0 of PETSc. Using the finite-element method, I solve the Maxwell equations on the interior of a 3D domain, coupled with boundary condition auxiliary equations on the boundary of the domain. The auxiliary equations employ auxiliary variables g. For ease of implementation of element matrix assembly, the auxiliary variables g are defined on the entire domain. However, only the basis functions for g with nonzero value at the boundary give nonzero entries in the system matrix. The element matrices hence have the structure [ A B; C D] at the boundary. In the interior the element matrices have the structure [A 0; 0 0]. The degrees of freedom in the system matrix can be ordered by element [u_e1 g_e1 u_e2 g_e2 ?] or by parallel process [u_p1 g_p1 u_p2 g_p2 ?]. To solve the system matrix, I need to filter out zero rows and columns: error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows); CHKERRABORT(PETSC_COMM_WORLD, error); error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix); CHKERRABORT(PETSC_COMM_WORLD, error); I solve the system matrix in parallel on multiple nodes connected with InfiniBand. The problem is that the MUMPS solver frequently (nondeterministically) hangs during KSPSolve() (after KSPSetUp() is completed). Running with the options -ksp_view and -info the last printed statement is: [0] VecScatterCreate_SF(): Using StarForest for vector scatter In the calculations where the program does not hang, the calculated solution is correct. The problem doesn?t occur for calculations on a single node, or for calculations with the SuperLU solver (but SuperLU will not allow calculations that are as large). 
The problem also doesn?t seem to occur for small problems. The problem doesn?t occur either when I put ones on the diagonal, but this is computationally expensive: error = MatFindZeroRows(stiffnessMatrix, &zeroRows); CHKERRABORT(PETSC_COMM_WORLD, error); error = MatZeroRowsColumnsIS(stiffnessMatrix, zeroRows, diagEntry, PETSC_IGNORE, PETSC_IGNORE); CHKERRABORT(PETSC_COMM_WORLD, error); Would you have any ideas on what I could check? Best regards, Sjoerd -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Nov 7 09:28:26 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Thu, 7 Nov 2019 15:28:26 +0000 Subject: [petsc-users] nondeterministic behavior of MUMPS when filtering out zero rows and columns In-Reply-To: References: Message-ID: Run your code with option '-ksp_error_if_not_converged' to get more info. Hong On Thu, Nov 7, 2019 at 5:45 AM s.a.hack--- via petsc-users > wrote: Hi, I am doing calculations with version 3.12.0 of PETSc. Using the finite-element method, I solve the Maxwell equations on the interior of a 3D domain, coupled with boundary condition auxiliary equations on the boundary of the domain. The auxiliary equations employ auxiliary variables g. For ease of implementation of element matrix assembly, the auxiliary variables g are defined on the entire domain. However, only the basis functions for g with nonzero value at the boundary give nonzero entries in the system matrix. The element matrices hence have the structure [ A B; C D] at the boundary. In the interior the element matrices have the structure [A 0; 0 0]. The degrees of freedom in the system matrix can be ordered by element [u_e1 g_e1 u_e2 g_e2 ?] or by parallel process [u_p1 g_p1 u_p2 g_p2 ?]. To solve the system matrix, I need to filter out zero rows and columns: error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows); CHKERRABORT(PETSC_COMM_WORLD, error); error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix); CHKERRABORT(PETSC_COMM_WORLD, error); I solve the system matrix in parallel on multiple nodes connected with InfiniBand. The problem is that the MUMPS solver frequently (nondeterministically) hangs during KSPSolve() (after KSPSetUp() is completed). Running with the options -ksp_view and -info the last printed statement is: [0] VecScatterCreate_SF(): Using StarForest for vector scatter In the calculations where the program does not hang, the calculated solution is correct. The problem doesn?t occur for calculations on a single node, or for calculations with the SuperLU solver (but SuperLU will not allow calculations that are as large). The problem also doesn?t seem to occur for small problems. The problem doesn?t occur either when I put ones on the diagonal, but this is computationally expensive: error = MatFindZeroRows(stiffnessMatrix, &zeroRows); CHKERRABORT(PETSC_COMM_WORLD, error); error = MatZeroRowsColumnsIS(stiffnessMatrix, zeroRows, diagEntry, PETSC_IGNORE, PETSC_IGNORE); CHKERRABORT(PETSC_COMM_WORLD, error); Would you have any ideas on what I could check? Best regards, Sjoerd -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Nov 7 10:40:05 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Nov 2019 11:40:05 -0500 Subject: [petsc-users] nondeterministic behavior of MUMPS when filtering out zero rows and columns In-Reply-To: References: Message-ID: On Thu, Nov 7, 2019 at 6:44 AM s.a.hack--- via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > > > I am doing calculations with version 3.12.0 of PETSc. > > Using the finite-element method, I solve the Maxwell equations on the > interior of a 3D domain, coupled with boundary condition auxiliary > equations on the boundary of the domain. The auxiliary equations employ > auxiliary variables g. > > > > For ease of implementation of element matrix assembly, the auxiliary > variables g are defined on the entire domain. However, only the basis > functions for g with nonzero value at the boundary give nonzero entries in > the system matrix. > > > > The element matrices hence have the structure > > [ A B; C D] > > at the boundary. > > > > In the interior the element matrices have the structure > > [A 0; 0 0]. > > > > The degrees of freedom in the system matrix can be ordered by element > [u_e1 g_e1 u_e2 g_e2 ?] or by parallel process [u_p1 g_p1 u_p2 g_p2 ?]. > > > > To solve the system matrix, I need to filter out zero rows and columns: > > error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows); > > CHKERRABORT(PETSC_COMM_WORLD, error); > > error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, > MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix); > > CHKERRABORT(PETSC_COMM_WORLD, error); > > > > I solve the system matrix in parallel on multiple nodes connected with > InfiniBand. > > The problem is that the MUMPS solver frequently (nondeterministically) > hangs during KSPSolve() (after KSPSetUp() is completed). > > Running with the options -ksp_view and -info the last printed statement is: > > [0] VecScatterCreate_SF(): Using StarForest for vector scatter > There is a bug in some older MPI implementations. You can try using -vec_assembly_legacy -matstash_legacy to see if you avoid the bug. > In the calculations where the program does not hang, the calculated > solution is correct. > > > > The problem doesn?t occur for calculations on a single node, or for > calculations with the SuperLU solver (but SuperLU will not allow > calculations that are as large). > SuperLU_dist can do large problems. Use --download-superlu_dist > The problem also doesn?t seem to occur for small problems. > > The problem doesn?t occur either when I put ones on the diagonal, but this > is computationally expensive: > > error = MatFindZeroRows(stiffnessMatrix, &zeroRows); > > CHKERRABORT(PETSC_COMM_WORLD, error); > > error = MatZeroRowsColumnsIS(stiffnessMatrix, zeroRows, diagEntry, > PETSC_IGNORE, PETSC_IGNORE); > > CHKERRABORT(PETSC_COMM_WORLD, error); > The two function calls above are expensive? I can you run it with -log_view and send the timing? Thanks, Matt > > > Would you have any ideas on what I could check? > > > > Best regards, > > Sjoerd > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
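The suggestions above are all run-time options, so they can be tried without touching the assembly code. For reference, a minimal sketch of the pattern this thread is about — compress away the all-zero rows/columns and hand the reduced operator to a KSP configured from the command line — could look as follows. This is not the poster's actual code: the function name SolveReducedSystem and the right-hand-side handling through VecGetSubVector() are assumptions about how the reduced system is driven, the variable names follow the snippets quoted above, and CHKERRQ is used in place of CHKERRABORT.

#include <petscksp.h>

/* Solve the compressed system obtained by keeping only the nonzero rows/columns of
   stiffnessMatrix. The factorization package is left to the command line, e.g.
   -pc_type lu plus the MUMPS or SuperLU_dist choice and the legacy assembly
   options suggested above. */
PetscErrorCode SolveReducedSystem(Mat stiffnessMatrix, Vec rhs)
{
  PetscErrorCode error;
  IS             nonzeroRows;
  Mat            stiffnessMatrixSubMatrix;
  Vec            rhsSub, solutionSub;
  KSP            ksp;

  PetscFunctionBeginUser;
  error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows);CHKERRQ(error);
  error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix);CHKERRQ(error);

  /* restrict the right-hand side to the rows that were kept */
  error = VecGetSubVector(rhs, nonzeroRows, &rhsSub);CHKERRQ(error);
  error = VecDuplicate(rhsSub, &solutionSub);CHKERRQ(error);

  error = KSPCreate(PetscObjectComm((PetscObject)stiffnessMatrix), &ksp);CHKERRQ(error);
  error = KSPSetOperators(ksp, stiffnessMatrixSubMatrix, stiffnessMatrixSubMatrix);CHKERRQ(error);
  error = KSPSetFromOptions(ksp);CHKERRQ(error);   /* picks up the -pc_* / -ksp_* options */
  error = KSPSolve(ksp, rhsSub, solutionSub);CHKERRQ(error);

  /* scatter solutionSub back into a full-length solution vector here as needed */

  error = VecRestoreSubVector(rhs, nonzeroRows, &rhsSub);CHKERRQ(error);
  error = VecDestroy(&solutionSub);CHKERRQ(error);
  error = KSPDestroy(&ksp);CHKERRQ(error);
  error = MatDestroy(&stiffnessMatrixSubMatrix);CHKERRQ(error);
  error = ISDestroy(&nonzeroRows);CHKERRQ(error);
  PetscFunctionReturn(0);
}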
URL: From Alexander.vonRamm at lrz.de Thu Nov 7 11:11:51 2019 From: Alexander.vonRamm at lrz.de (von Ramm, Alexander) Date: Thu, 7 Nov 2019 17:11:51 +0000 Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? Message-ID: <4f6def37121145cbb9ca32e28495f4c6@lrz.de> Hello together, I'm trying to figure out how to create a DMNetwork, but the proper way to set the parameters eludes me (also there is some discrepancy between the manual https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf (page 166) and the online documenation https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMNetwork/DMNetworkSetSizes.html ). Currently I'm trying to set a up a simple Network with 8 nodes and 7 edges and distribute it over 2 processors. In the call DMNetworkSetSizes does Nsubnet need to be 1 (one global network, without any further subnetworks) or 2 (one subnetwork per processor) (my guess would be the former). My current attempt looks like the following: int main( int argc, char *argv[]) { PetscInitialize(&argc, &argv, NULL, NULL); DM dm; PetscInt NSubnet = 1; PetscInt nV[1] = {8}; PetscInt nE[1] = {7}; PetscInt NsubnetCouple = 0; PetscInt nec[0]; DMNetworkCreate(PETSC_COMM_WORLD, &dm); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, nec); PetscInt *edgeList; PetscMalloc1(14, &edgeList); edgeList[0] = 0; edgeList[1] = 4; edgeList[2] = 1; edgeList[3] = 4; edgeList[4] = 2; edgeList[5] = 5; edgeList[6] = 3; edgeList[7] = 5; edgeList[8] = 4; edgeList[9] = 6; edgeList[10] = 5; edgeList[11] = 6; edgeList[12] = 6; edgeList[13] = 7; PetscInt *edges[1]; edges[0] = edgeList; DMNetworkSetEdgeList(dm, edges, NULL); DMNetworkLayoutSetUp(dm); return 0; } Except from the Online Documenation I wasn't able to find any information/example where newest version Petsc was used when setting up a DMNetwork. (I found a few examples using older versions, however I could not figure out how these could solved using the newest version). If some could explain to me the the correct interpretation of the parameters of DMNetworkSetSizes ? Any pointers to examples using the newest API would also be much appreciated. Thanks and best Regards, Alex From hzhang at mcs.anl.gov Thu Nov 7 13:53:55 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Thu, 7 Nov 2019 19:53:55 +0000 Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? In-Reply-To: <4f6def37121145cbb9ca32e28495f4c6@lrz.de> References: <4f6def37121145cbb9ca32e28495f4c6@lrz.de> Message-ID: Alex, DMNetwork is under active development, thus our manual might not be updated. I'll check it. I'm attaching a paper which might be used as a manual for the latest DMNetwork. For your case, Nsubnet = 1, which is determined by application, not number of processes. There are several examples using DMNetwork in petsc, e.g., under ~petsc, $ git grep DMNetworkCreate * ... 
src/ksp/ksp/examples/tutorials/network/ex1.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&dmnetwork);CHKERRQ(ierr); src/ksp/ksp/examples/tutorials/network/ex1_nest.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/ksp/ksp/examples/tutorials/network/ex2.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/snes/examples/tutorials/network/ex1.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/snes/examples/tutorials/network/power/power.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/snes/examples/tutorials/network/power/power2.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/snes/examples/tutorials/network/water/water.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/ts/examples/tutorials/network/wash/pipes1.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); src/ts/examples/tutorials/power_grid/stability_9bus/ex9busdmnetwork.c: ierr = DMNetworkCreate(PETSC_COMM_WORLD,&networkdm);CHKERRQ(ierr); Hong On Thu, Nov 7, 2019 at 11:12 AM von Ramm, Alexander via petsc-users > wrote: Hello together, I'm trying to figure out how to create a DMNetwork, but the proper way to set the parameters eludes me (also there is some discrepancy between the manual https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf (page 166) and the online documenation https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMNetwork/DMNetworkSetSizes.html ). Currently I'm trying to set a up a simple Network with 8 nodes and 7 edges and distribute it over 2 processors. In the call DMNetworkSetSizes does Nsubnet need to be 1 (one global network, without any further subnetworks) or 2 (one subnetwork per processor) (my guess would be the former). My current attempt looks like the following: int main( int argc, char *argv[]) { PetscInitialize(&argc, &argv, NULL, NULL); DM dm; PetscInt NSubnet = 1; PetscInt nV[1] = {8}; PetscInt nE[1] = {7}; PetscInt NsubnetCouple = 0; PetscInt nec[0]; DMNetworkCreate(PETSC_COMM_WORLD, &dm); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, nec); PetscInt *edgeList; PetscMalloc1(14, &edgeList); edgeList[0] = 0; edgeList[1] = 4; edgeList[2] = 1; edgeList[3] = 4; edgeList[4] = 2; edgeList[5] = 5; edgeList[6] = 3; edgeList[7] = 5; edgeList[8] = 4; edgeList[9] = 6; edgeList[10] = 5; edgeList[11] = 6; edgeList[12] = 6; edgeList[13] = 7; PetscInt *edges[1]; edges[0] = edgeList; DMNetworkSetEdgeList(dm, edges, NULL); DMNetworkLayoutSetUp(dm); return 0; } Except from the Online Documenation I wasn't able to find any information/example where newest version Petsc was used when setting up a DMNetwork. (I found a few examples using older versions, however I could not figure out how these could solved using the newest version). If some could explain to me the the correct interpretation of the parameters of DMNetworkSetSizes ? Any pointers to examples using the newest API would also be much appreciated. Thanks and best Regards, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dmnetwork-2019.pdf Type: application/pdf Size: 1415801 bytes Desc: dmnetwork-2019.pdf URL: From shrirang.abhyankar at pnnl.gov Thu Nov 7 14:47:17 2019 From: shrirang.abhyankar at pnnl.gov (Abhyankar, Shrirang G) Date: Thu, 7 Nov 2019 20:47:17 +0000 Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? 
In-Reply-To: <4f6def37121145cbb9ca32e28495f4c6@lrz.de> References: <4f6def37121145cbb9ca32e28495f4c6@lrz.de> Message-ID: <6FE8D7D5-16FE-4172-A40D-5395FB416E72@pnnl.gov> From: petsc-users on behalf of "von Ramm, Alexander via petsc-users" Reply-To: "von Ramm, Alexander" Date: Thursday, November 7, 2019 at 11:12 AM To: "petsc-users at mcs.anl.gov" Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? Hello together, I'm trying to figure out how to create a DMNetwork, but the proper way to set the parameters eludes me (also there is some discrepancy between the manual https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf (page 166) and the online documenation https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMNetwork/DMNetworkSetSizes.html ). Thanks for pointing out the discrepancy. We?ll update the user manual. Currently I'm trying to set a up a simple Network with 8 nodes and 7 edges and distribute it over 2 processors. In the call DMNetworkSetSizes does Nsubnet need to be 1 (one global network, without any further subnetworks) or 2 (one subnetwork per processor) (my guess would be the former). Yes, Nsubnet = 1 for your application since you just have a single network, i.e., no subnetworks. My current attempt looks like the following: int main( int argc, char *argv[]) { PetscInitialize(&argc, &argv, NULL, NULL); DM dm; PetscInt NSubnet = 1; PetscInt nV[1] = {8}; PetscInt nE[1] = {7}; PetscInt NsubnetCouple = 0; PetscInt nec[0]; You do not need nec. You can simply set it to NULL. DMNetworkCreate(PETSC_COMM_WORLD, &dm); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, nec); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, NULL); PetscInt *edgeList; PetscMalloc1(14, &edgeList); edgeList[0] = 0; edgeList[1] = 4; edgeList[2] = 1; edgeList[3] = 4; edgeList[4] = 2; edgeList[5] = 5; edgeList[6] = 3; edgeList[7] = 5; edgeList[8] = 4; edgeList[9] = 6; edgeList[10] = 5; edgeList[11] = 6; edgeList[12] = 6; edgeList[13] = 7; PetscInt *edges[1]; edges[0] = edgeList; DMNetworkSetEdgeList(dm, edges, NULL); DMNetworkLayoutSetUp(dm); return 0; } Except from the Online Documenation I wasn't able to find any information/example where newest version Petsc was used when setting up a DMNetwork. (I found a few examples using older versions, however I could not figure out how these could solved using the newest version). If some could explain to me the the correct interpretation of the parameters of DMNetworkSetSizes ? Any pointers to examples using the newest API would also be much appreciated. Most of the examples with DMNetwork have a single network (Nsubnet = 1), except src/snes/examples/tutorials/network/ex1.c. This is a water + electric network simulation that has two subnetworks. Shri Thanks and best Regards, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Nov 8 00:05:10 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 8 Nov 2019 06:05:10 +0000 Subject: [petsc-users] nondeterministic behavior of MUMPS when filtering out zero rows and columns In-Reply-To: References: Message-ID: Make sure you have the latest PETSc and MUMPS installed; they have fixed bugs in MUMPs over time. Hanging locations are best found with a debugger; there is really no other way. If you have a parallel debugger like DDT use it. If you don't you can use the PETSc option -start_in_debugger to have PETSc start a line debugger in an xterm for each process. 
Type cont in each window and when it "hangs" do control C in the windows and type bt It will show the traceback where it is hanging on each process. Send us the output. Barry Another approach that avoids the debugger is to send to one of the MPI processes a signal, term would be a good one to use. If you are luck that process with catch the signal and print a traceback of where it is when the signal occurred. If you are super lucky you can send the signal to several processes and get several tracebacks. > On Nov 7, 2019, at 5:44 AM, s.a.hack--- via petsc-users wrote: > > Hi, > > I am doing calculations with version 3.12.0 of PETSc. > Using the finite-element method, I solve the Maxwell equations on the interior of a 3D domain, coupled with boundary condition auxiliary equations on the boundary of the domain. The auxiliary equations employ auxiliary variables g. > > For ease of implementation of element matrix assembly, the auxiliary variables g are defined on the entire domain. However, only the basis functions for g with nonzero value at the boundary give nonzero entries in the system matrix. > > The element matrices hence have the structure > [ A B; C D] > at the boundary. > > In the interior the element matrices have the structure > [A 0; 0 0]. > > The degrees of freedom in the system matrix can be ordered by element [u_e1 g_e1 u_e2 g_e2 ?] or by parallel process [u_p1 g_p1 u_p2 g_p2 ?]. > > To solve the system matrix, I need to filter out zero rows and columns: > error = MatFindNonzeroRows(stiffnessMatrix, &nonzeroRows); > CHKERRABORT(PETSC_COMM_WORLD, error); > error = MatCreateSubMatrix(stiffnessMatrix, nonzeroRows, nonzeroRows, MAT_INITIAL_MATRIX, &stiffnessMatrixSubMatrix); > CHKERRABORT(PETSC_COMM_WORLD, error); > > I solve the system matrix in parallel on multiple nodes connected with InfiniBand. > The problem is that the MUMPS solver frequently (nondeterministically) hangs during KSPSolve() (after KSPSetUp() is completed). > Running with the options -ksp_view and -info the last printed statement is: > [0] VecScatterCreate_SF(): Using StarForest for vector scatter > In the calculations where the program does not hang, the calculated solution is correct. > > The problem doesn?t occur for calculations on a single node, or for calculations with the SuperLU solver (but SuperLU will not allow calculations that are as large). > The problem also doesn?t seem to occur for small problems. > The problem doesn?t occur either when I put ones on the diagonal, but this is computationally expensive: > error = MatFindZeroRows(stiffnessMatrix, &zeroRows); > CHKERRABORT(PETSC_COMM_WORLD, error); > error = MatZeroRowsColumnsIS(stiffnessMatrix, zeroRows, diagEntry, PETSC_IGNORE, PETSC_IGNORE); > CHKERRABORT(PETSC_COMM_WORLD, error); > > Would you have any ideas on what I could check? > > Best regards, > Sjoerd From Alexander.vonRamm at lrz.de Fri Nov 8 01:06:22 2019 From: Alexander.vonRamm at lrz.de (von Ramm, Alexander) Date: Fri, 8 Nov 2019 07:06:22 +0000 Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? In-Reply-To: <6FE8D7D5-16FE-4172-A40D-5395FB416E72@pnnl.gov> References: <4f6def37121145cbb9ca32e28495f4c6@lrz.de>, <6FE8D7D5-16FE-4172-A40D-5395FB416E72@pnnl.gov> Message-ID: Hi Shri, Hi Hong, thanks a lot for the information. This already helps a lot. 
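For reference, a minimal sketch that folds the suggestions above into the original example (Nsubnet = 1, NULL for the coupling-edge array, error checking added) could look like the following; component registration, DMSetUp, and distribution are left out, so this only builds the layout:

#include <petscdmnetwork.h>

int main(int argc, char *argv[])
{
  DM             dm;
  PetscInt       nV[1] = {8}, nE[1] = {7};
  /* same 8-vertex / 7-edge list as above, written as (from,to) pairs */
  PetscInt       edgeList[14] = {0,4, 1,4, 2,5, 3,5, 4,6, 5,6, 6,7};
  PetscInt       *edges[1];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = DMNetworkCreate(PETSC_COMM_WORLD, &dm);CHKERRQ(ierr);
  ierr = DMNetworkSetSizes(dm, 1, nV, nE, 0, NULL);CHKERRQ(ierr);
  edges[0] = edgeList;
  ierr = DMNetworkSetEdgeList(dm, edges, NULL);CHKERRQ(ierr);
  ierr = DMNetworkLayoutSetUp(dm);CHKERRQ(ierr);
  /* ... register components, set up and distribute the DM, solve ... */
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The tutorials listed earlier in the thread (e.g. src/snes/examples/tutorials/network/power/power.c) show how the component registration and distribution steps continue from this point.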
Best, Alex ________________________________________ From: Abhyankar, Shrirang G Sent: Thursday, November 7, 2019 9:47:17 PM To: von Ramm, Alexander; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? From: petsc-users on behalf of "von Ramm, Alexander via petsc-users" Reply-To: "von Ramm, Alexander" Date: Thursday, November 7, 2019 at 11:12 AM To: "petsc-users at mcs.anl.gov" Subject: [petsc-users] DMNetwork - how to interprete the arguments of DMNetworkSetSizes ? Hello together, I'm trying to figure out how to create a DMNetwork, but the proper way to set the parameters eludes me (also there is some discrepancy between the manual https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf (page 166) and the online documenation https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMNetwork/DMNetworkSetSizes.html ). Thanks for pointing out the discrepancy. We?ll update the user manual. Currently I'm trying to set a up a simple Network with 8 nodes and 7 edges and distribute it over 2 processors. In the call DMNetworkSetSizes does Nsubnet need to be 1 (one global network, without any further subnetworks) or 2 (one subnetwork per processor) (my guess would be the former). Yes, Nsubnet = 1 for your application since you just have a single network, i.e., no subnetworks. My current attempt looks like the following: int main( int argc, char *argv[]) { PetscInitialize(&argc, &argv, NULL, NULL); DM dm; PetscInt NSubnet = 1; PetscInt nV[1] = {8}; PetscInt nE[1] = {7}; PetscInt NsubnetCouple = 0; PetscInt nec[0]; You do not need nec. You can simply set it to NULL. DMNetworkCreate(PETSC_COMM_WORLD, &dm); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, nec); DMNetworkSetSizes(dm, NSubnet, nV, nE, NsubnetCouple, NULL); PetscInt *edgeList; PetscMalloc1(14, &edgeList); edgeList[0] = 0; edgeList[1] = 4; edgeList[2] = 1; edgeList[3] = 4; edgeList[4] = 2; edgeList[5] = 5; edgeList[6] = 3; edgeList[7] = 5; edgeList[8] = 4; edgeList[9] = 6; edgeList[10] = 5; edgeList[11] = 6; edgeList[12] = 6; edgeList[13] = 7; PetscInt *edges[1]; edges[0] = edgeList; DMNetworkSetEdgeList(dm, edges, NULL); DMNetworkLayoutSetUp(dm); return 0; } Except from the Online Documenation I wasn't able to find any information/example where newest version Petsc was used when setting up a DMNetwork. (I found a few examples using older versions, however I could not figure out how these could solved using the newest version). If some could explain to me the the correct interpretation of the parameters of DMNetworkSetSizes ? Any pointers to examples using the newest API would also be much appreciated. Most of the examples with DMNetwork have a single network (Nsubnet = 1), except src/snes/examples/tutorials/network/ex1.c. This is a water + electric network simulation that has two subnetworks. Shri Thanks and best Regards, Alex From juaneah at gmail.com Fri Nov 8 01:22:28 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Fri, 8 Nov 2019 01:22:28 -0600 Subject: [petsc-users] doubts on VecScatterCreate In-Reply-To: <35D3FEA6-6B71-44F4-B746-23461B581E9C@anl.gov> References: <35D3FEA6-6B71-44F4-B746-23461B581E9C@anl.gov> Message-ID: Hi, Thank you very much for the help and the quickly answer. After check my code the problem was previous to scattering. 
*The scatter routines work perfectly.* Just to undesrtand, in the line: ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); if I use MPI_COMM_WORLD, it means that all the processes have a copy of the (current local process) index set? If I use MPI_COMM_SELF, it means that only the local process have information about the index set? Kind regards El lun., 4 de nov. de 2019 a la(s) 08:47, Smith, Barry F. ( bsmith at mcs.anl.gov) escribi?: > > It works for me. Please send a complete code that fails. > > > > > > On Nov 3, 2019, at 11:41 PM, Emmanuel Ayala via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > Hi everyone, thanks in advance. > > > > I have three parallel vectors: A, B and C. A and B have different sizes, > and C must be contain these two vectors (MatLab notation C=[A;B]). I need > to do some operations on C then put back the proper portion of C on A and > B, then I do some computations on A and B y put again on C, and the loop > repeats. > > > > For these propose I use Scatters: > > > > C is created as a parallel vector with size of (sizeA + sizeB) with > petsc_decide for parallel layout. The vectors have been distributed on the > same amount of processes. > > > > For the specific case with order [A;B] > > > > VecGetOwnershipRange(A,&start,&end); > > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); > > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is > redundant > > VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); > > > > VecGetSize(A,&sizeA) > > VecGetOwnershipRange(B,&start,&end); > > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); > > ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); > //shifts the index location > > VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); > > > > Then I can use > > VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > > VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); > > > > and > > VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > > VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); > > > > and the same with B. > > I used MPI_COMM SELF and I got the same results. > > > > The situation is: My results look good for the portion of B, but no for > the portion of A, there is something that I'm doing wrong with the > scattering? > > > > Best regards. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 8 04:23:34 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Nov 2019 05:23:34 -0500 Subject: [petsc-users] doubts on VecScatterCreate In-Reply-To: References: <35D3FEA6-6B71-44F4-B746-23461B581E9C@anl.gov> Message-ID: On Fri, Nov 8, 2019 at 2:23 AM Emmanuel Ayala via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > Thank you very much for the help and the quickly answer. > > After check my code the problem was previous to scattering. *The scatter > routines work perfectly.* > > Just to undesrtand, in the line: > > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); > > if I use MPI_COMM_WORLD, it means that all the processes have a copy of > the (current local process) index set? > If I use MPI_COMM_SELF, it means that only the local process have > information about the index set? > No, each process has whatever you give it in this call. 
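A small self-contained sketch (the local size of 5 per rank is arbitrary): each rank's IS holds exactly the indices that rank passed in, and creating it on PETSC_COMM_SELF would leave those local contents unchanged.

#include <petscvec.h>
#include <petscis.h>

int main(int argc, char **argv)
{
  Vec            A;
  IS             is;
  PetscInt       start, end;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreateMPI(PETSC_COMM_WORLD, 5, PETSC_DECIDE, &A);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(A, &start, &end);CHKERRQ(ierr);
  /* this rank stores start, start+1, ..., end-1 -- its own range only */
  ierr = ISCreateStride(PETSC_COMM_WORLD, end - start, start, 1, &is);CHKERRQ(ierr);
  ierr = ISView(is, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = ISDestroy(&is);CHKERRQ(ierr);
  ierr = VecDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
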
The comm determines what happens with collective calls, like https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISGetTotalIndices.html Thanks, Matt > Kind regards > > > > El lun., 4 de nov. de 2019 a la(s) 08:47, Smith, Barry F. ( > bsmith at mcs.anl.gov) escribi?: > >> >> It works for me. Please send a complete code that fails. >> >> >> >> >> > On Nov 3, 2019, at 11:41 PM, Emmanuel Ayala via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > >> > Hi everyone, thanks in advance. >> > >> > I have three parallel vectors: A, B and C. A and B have different >> sizes, and C must be contain these two vectors (MatLab notation C=[A;B]). I >> need to do some operations on C then put back the proper portion of C on A >> and B, then I do some computations on A and B y put again on C, and the >> loop repeats. >> > >> > For these propose I use Scatters: >> > >> > C is created as a parallel vector with size of (sizeA + sizeB) with >> petsc_decide for parallel layout. The vectors have been distributed on the >> same amount of processes. >> > >> > For the specific case with order [A;B] >> > >> > VecGetOwnershipRange(A,&start,&end); >> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); >> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is >> redundant >> > VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); >> > >> > VecGetSize(A,&sizeA) >> > VecGetOwnershipRange(B,&start,&end); >> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); >> > ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); >> //shifts the index location >> > VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); >> > >> > Then I can use >> > VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); >> > VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); >> > >> > and >> > VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); >> > VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); >> > >> > and the same with B. >> > I used MPI_COMM SELF and I got the same results. >> > >> > The situation is: My results look good for the portion of B, but no for >> the portion of A, there is something that I'm doing wrong with the >> scattering? >> > >> > Best regards. >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From juaneah at gmail.com Fri Nov 8 11:11:20 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Fri, 8 Nov 2019 11:11:20 -0600 Subject: [petsc-users] doubts on VecScatterCreate In-Reply-To: References: <35D3FEA6-6B71-44F4-B746-23461B581E9C@anl.gov> Message-ID: Ok, thanks for the clarification. Kind regards. El vie., 8 de nov. de 2019 a la(s) 04:23, Matthew Knepley (knepley at gmail.com) escribi?: > On Fri, Nov 8, 2019 at 2:23 AM Emmanuel Ayala via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> Thank you very much for the help and the quickly answer. >> >> After check my code the problem was previous to scattering. *The scatter >> routines work perfectly.* >> >> Just to undesrtand, in the line: >> >> ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); >> >> if I use MPI_COMM_WORLD, it means that all the processes have a copy of >> the (current local process) index set? 
>> If I use MPI_COMM_SELF, it means that only the local process have >> information about the index set? >> > > No, each process has whatever you give it in this call. The comm > determines what happens with collective calls, like > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISGetTotalIndices.html > > Thanks, > > Matt > > >> Kind regards >> >> >> >> El lun., 4 de nov. de 2019 a la(s) 08:47, Smith, Barry F. ( >> bsmith at mcs.anl.gov) escribi?: >> >>> >>> It works for me. Please send a complete code that fails. >>> >>> >>> >>> >>> > On Nov 3, 2019, at 11:41 PM, Emmanuel Ayala via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > >>> > Hi everyone, thanks in advance. >>> > >>> > I have three parallel vectors: A, B and C. A and B have different >>> sizes, and C must be contain these two vectors (MatLab notation C=[A;B]). I >>> need to do some operations on C then put back the proper portion of C on A >>> and B, then I do some computations on A and B y put again on C, and the >>> loop repeats. >>> > >>> > For these propose I use Scatters: >>> > >>> > C is created as a parallel vector with size of (sizeA + sizeB) with >>> petsc_decide for parallel layout. The vectors have been distributed on the >>> same amount of processes. >>> > >>> > For the specific case with order [A;B] >>> > >>> > VecGetOwnershipRange(A,&start,&end); >>> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromA); >>> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_toC1);// this is >>> redundant >>> > VecScatterCreate(A,is_fromA,C,is_toC1,&scatter1); >>> > >>> > VecGetSize(A,&sizeA) >>> > VecGetOwnershipRange(B,&start,&end); >>> > ISCreateStride(MPI_COMM_WORLD,(end-start),start,1,&is_fromB); >>> > ISCreateStride(MPI_COMM_WORLD,(end-start),(start+sizeA),1,&is_toC2); >>> //shifts the index location >>> > VecScatterCreate(B,is_fromB,C,is_toC2,&scatter2); >>> > >>> > Then I can use >>> > VecScatterBegin(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); >>> > VecScatterEnd(scatter1,A,C,INSERT_VALUES,SCATTER_FORWARD); >>> > >>> > and >>> > VecScatterBegin(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); >>> > VecScatterEnd(scatter1,C,A,INSERT_VALUES,SCATTER_REVERSE); >>> > >>> > and the same with B. >>> > I used MPI_COMM SELF and I got the same results. >>> > >>> > The situation is: My results look good for the portion of B, but no >>> for the portion of A, there is something that I'm doing wrong with the >>> scattering? >>> > >>> > Best regards. >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Mon Nov 11 08:20:59 2019 From: mlohry at gmail.com (Mark Lohry) Date: Mon, 11 Nov 2019 09:20:59 -0500 Subject: [petsc-users] Line search Ended due to ynorm, nondeterministic stagnation Message-ID: Symptom: running the same code on the same core count multiple times, 80% of the time it converges out no problem. 20% of the time SNES seems to reset to the previous step, showing "Line search: Ended due to ynorm < stol*xnorm" Sample output below. Everything is healthy until TS step 7, which starts with a norm of 1.181275684011e-04. 
After that point, every KSP converges to RTOL and I see the line search message, and the subsequent timesteps all seem like they don't use the previous update, because they have exactly the same residuals. Running the same thing again, it happily proceeds through and converges out without the stagnation. Does this sound familiar to anyone? 5 TS dt 30. time 150. 0 SNES Function norm 2.654593713313e-03 0 KSP Residual norm 2.654593713313e-03 ... 41 KSP Residual norm 2.515907124549e-04 Linear solve converged due to CONVERGED_RTOL iterations 41 Line search: gnorm after quadratic fit 2.445531672458e-03 Line search: Quadratically determined step, lambda=1.5043681801723077e-01 1 SNES Function norm 2.445531672458e-03 0 KSP Residual norm 2.445531672458e-03 ... 40 KSP Residual norm 2.281075491298e-04 Linear solve converged due to CONVERGED_RTOL iterations 40 Line search: gnorm after quadratic fit 2.158781959371e-03 Line search: Quadratically determined step, lambda=2.1593140643569805e-01 2 SNES Function norm 2.158781959371e-03 0 KSP Residual norm 2.158781959371e-03 40 KSP Residual norm 1.902363860750e-04 Linear solve converged due to CONVERGED_RTOL iterations 40 Line search: gnorm after quadratic fit 1.727564041943e-03 Line search: Quadratically determined step, lambda=3.7179194957755984e-01 3 SNES Function norm 1.727564041943e-03 0 KSP Residual norm 1.727564041943e-03 ... 38 KSP Residual norm 1.572811703375e-04 Linear solve converged due to CONVERGED_RTOL iterations 38 Line search: Using full step: fnorm 1.727564041943e-03 gnorm 1.074289112389e-03 4 SNES Function norm 1.074289112389e-03 0 KSP Residual norm 1.074289112389e-03 ... 18 KSP Residual norm 1.047939125382e-04 Linear solve converged due to CONVERGED_RTOL iterations 18 Line search: Using full step: fnorm 1.074289112389e-03 gnorm 1.049627557624e-04 5 SNES Function norm 1.049627557624e-04 Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 TSAdapt none beuler 0: step 5 accepted t=150 + 3.000e+01 dt=3.000e+01 6 TS dt 30. time 180. 0 SNES Function norm 1.181275684011e-04 0 KSP Residual norm 1.181275684011e-04 ... 44 KSP Residual norm 1.107203720458e-05 Linear solve converged due to CONVERGED_RTOL iterations 44 Line search: Ended due to ynorm < stol*xnorm (1.034132241459e-04 < 2.190637006857e-04). 1 SNES Function norm 1.181275684011e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt none beuler 0: step 6 accepted t=180 + 3.000e+01 dt=3.000e+01 7 TS dt 30. time 210. 0 SNES Function norm 1.181275684011e-04 0 KSP Residual norm 1.181275684011e-04 ... 44 KSP Residual norm 1.110103600083e-05 Linear solve converged due to CONVERGED_RTOL iterations 44 Line search: Ended due to ynorm < stol*xnorm (1.047067861804e-04 < 2.190637006857e-04). 1 SNES Function norm 1.181275684011e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt none beuler 0: step 7 accepted t=210 + 3.000e+01 dt=3.000e+01 8 TS dt 30. time 240. 0 SNES Function norm 1.181275684011e-04 0 KSP Residual norm 1.181275684011e-04 ... 44 KSP Residual norm 1.102709658960e-05 Linear solve converged due to CONVERGED_RTOL iterations 44 Line search: Ended due to ynorm < stol*xnorm (1.035351833216e-04 < 2.190637006857e-04). 
1 SNES Function norm 1.181275684011e-04 Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 TSAdapt none beuler 0: step 8 accepted t=240 + 3.000e+01 dt=3.000e+01 9 2.700e+02 3.000e+01 1.72412e-09 2.28859e-07 2.39889e-07 9.69858e-08 4.71512e-06 3.30532e-02 1.83203e+00 -3.38502e-01 -3.09701e+00 1.96301e-02 3.53675e-02 3.96395e+01 9 TS dt 30. time 270. 0 SNES Function norm 1.181275684011e-04 0 KSP Residual norm 1.181275684011e-04 ... -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Nov 11 16:49:12 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 11 Nov 2019 22:49:12 +0000 Subject: [petsc-users] Line search Ended due to ynorm, nondeterministic stagnation In-Reply-To: References: Message-ID: <3B6F0D75-3509-453C-B2B2-A6BE7ADF9D8C@anl.gov> Mark, What are you using for KSP rtol ? It looks like 1.e-1 from > 0 KSP Residual norm 2.654593713313e-03 > ... > 41 KSP Residual norm 2.515907124549e-04 What about SNES stol, are you setting that? > Line search: Ended due to ynorm < stol*xnorm (1.047067861804e-04 < 2.190637006857e-04). Any idea of the order of magnitude of the solution? These numbers are much larger than one normally sees in this situation. It looks like the linear solve is just not accurate enough so a descent direction is not generated and hence the line search has to give up. The problem comes up only in some runs because the linear solve is right on the cusp of good enough to find a descent direction and depending on the order of operations in the linear solve the solution sometimes is a descent direction and sometimes is not producing inconsistent behavior. I would just make the linear solver more accurate, pay the cost of a bit more time in trade off for removing the problems with failure Barry > On Nov 11, 2019, at 8:20 AM, Mark Lohry via petsc-users wrote: > > Symptom: running the same code on the same core count multiple times, 80% of the time it converges out no problem. 20% of the time SNES seems to reset to the previous step, showing > "Line search: Ended due to ynorm < stol*xnorm" > > Sample output below. Everything is healthy until TS step 7, which starts with a norm of 1.181275684011e-04. After that point, every KSP converges to RTOL and I see the line search message, and the subsequent timesteps all seem like they don't use the previous update, because they have exactly the same residuals. > > Running the same thing again, it happily proceeds through and converges out without the stagnation. > > Does this sound familiar to anyone? > > > 5 TS dt 30. time 150. > 0 SNES Function norm 2.654593713313e-03 > 0 KSP Residual norm 2.654593713313e-03 > ... > 41 KSP Residual norm 2.515907124549e-04 > Linear solve converged due to CONVERGED_RTOL iterations 41 > Line search: gnorm after quadratic fit 2.445531672458e-03 > Line search: Quadratically determined step, lambda=1.5043681801723077e-01 > 1 SNES Function norm 2.445531672458e-03 > 0 KSP Residual norm 2.445531672458e-03 > ... 
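(If so, one thing to try is simply tightening that tolerance, either with -ksp_rtol 1e-4 on the command line or in code. A fragment, assuming ts is the application's TS and with 1e-4 chosen only as an illustration:

SNES           snes;
KSP            ksp;
PetscErrorCode ierr;

ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
/* rtol = 1e-4; leave abstol, dtol, and max iterations at their defaults */
ierr = KSPSetTolerances(ksp, 1.e-4, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
)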
> 40 KSP Residual norm 2.281075491298e-04 > Linear solve converged due to CONVERGED_RTOL iterations 40 > Line search: gnorm after quadratic fit 2.158781959371e-03 > Line search: Quadratically determined step, lambda=2.1593140643569805e-01 > 2 SNES Function norm 2.158781959371e-03 > 0 KSP Residual norm 2.158781959371e-03 > 40 KSP Residual norm 1.902363860750e-04 > Linear solve converged due to CONVERGED_RTOL iterations 40 > Line search: gnorm after quadratic fit 1.727564041943e-03 > Line search: Quadratically determined step, lambda=3.7179194957755984e-01 > 3 SNES Function norm 1.727564041943e-03 > 0 KSP Residual norm 1.727564041943e-03 > ... > 38 KSP Residual norm 1.572811703375e-04 > Linear solve converged due to CONVERGED_RTOL iterations 38 > Line search: Using full step: fnorm 1.727564041943e-03 gnorm 1.074289112389e-03 > 4 SNES Function norm 1.074289112389e-03 > 0 KSP Residual norm 1.074289112389e-03 > ... > 18 KSP Residual norm 1.047939125382e-04 > Linear solve converged due to CONVERGED_RTOL iterations 18 > Line search: Using full step: fnorm 1.074289112389e-03 gnorm 1.049627557624e-04 > 5 SNES Function norm 1.049627557624e-04 > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5 > TSAdapt none beuler 0: step 5 accepted t=150 + 3.000e+01 dt=3.000e+01 > 6 TS dt 30. time 180. > 0 SNES Function norm 1.181275684011e-04 > 0 KSP Residual norm 1.181275684011e-04 > ... > > 44 KSP Residual norm 1.107203720458e-05 > Linear solve converged due to CONVERGED_RTOL iterations 44 > Line search: Ended due to ynorm < stol*xnorm (1.034132241459e-04 < 2.190637006857e-04). > 1 SNES Function norm 1.181275684011e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt none beuler 0: step 6 accepted t=180 + 3.000e+01 dt=3.000e+01 > 7 TS dt 30. time 210. > 0 SNES Function norm 1.181275684011e-04 > 0 KSP Residual norm 1.181275684011e-04 > ... > 44 KSP Residual norm 1.110103600083e-05 > Linear solve converged due to CONVERGED_RTOL iterations 44 > Line search: Ended due to ynorm < stol*xnorm (1.047067861804e-04 < 2.190637006857e-04). > 1 SNES Function norm 1.181275684011e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt none beuler 0: step 7 accepted t=210 + 3.000e+01 dt=3.000e+01 > 8 TS dt 30. time 240. > 0 SNES Function norm 1.181275684011e-04 > 0 KSP Residual norm 1.181275684011e-04 > ... > 44 KSP Residual norm 1.102709658960e-05 > Linear solve converged due to CONVERGED_RTOL iterations 44 > Line search: Ended due to ynorm < stol*xnorm (1.035351833216e-04 < 2.190637006857e-04). > 1 SNES Function norm 1.181275684011e-04 > Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1 > TSAdapt none beuler 0: step 8 accepted t=240 + 3.000e+01 dt=3.000e+01 > 9 2.700e+02 3.000e+01 1.72412e-09 2.28859e-07 2.39889e-07 9.69858e-08 4.71512e-06 3.30532e-02 1.83203e+00 -3.38502e-01 -3.09701e+00 1.96301e-02 3.53675e-02 3.96395e+01 > 9 TS dt 30. time 270. > 0 SNES Function norm 1.181275684011e-04 > 0 KSP Residual norm 1.181275684011e-04 > ... > > From gideon.simpson at gmail.com Mon Nov 11 19:00:32 2019 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Mon, 11 Nov 2019 20:00:32 -0500 Subject: [petsc-users] ts behavior question Message-ID: I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: 1. I have to explicitly provide the Jacobian 2. 
When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. 3. My code works without declaring const when I'm using an explicit scheme. In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. Can someone clarify what is expected/preferred? -- gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Nov 11 23:33:50 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 12 Nov 2019 05:33:50 +0000 Subject: [petsc-users] ts behavior question In-Reply-To: References: Message-ID: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users wrote: > > I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: > 1. I have to explicitly provide the Jacobian Yes > 2. When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. Presumably you call VecGetArray() instead? > > > 3. My code works without declaring const when I'm using an explicit scheme. > > In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. The use with DMDA provides automatic routines for computing the needed Jacobians using finite differencing of your provided function and coloring of the Jacobian. This results in reasonably efficient computation of Jacobians that work in most (almost all) cases. > > Can someone clarify what is expected/preferred? You should always use VecGetArrayRead() for vectors you are accessing but NOT changing the values in. There is no reason not and it provides the potential for higher performance. The algebraic solvers have additional checks to prevent peopled from inadvertently changing the entries in x (which would produce bugs). Presumably this results in generating an error when you call VecGetArray(). At least some of the TS explicit calls do not have such checks. They could be added and should be added. https://gitlab.com/petsc/petsc/issues/493 Thanks for pointing out the inconsistency Barry > > -- > gideon From gideon.simpson at gmail.com Tue Nov 12 09:26:21 2019 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 12 Nov 2019 10:26:21 -0500 Subject: [petsc-users] ts behavior question In-Reply-To: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> Message-ID: So, in principle, should we actually be using DMDAVecGetArrayRead in this context? I seem to be able to get away with DMDAVecGetArray with all time steppers. On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. wrote: > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > I noticed that when I am solving a problem with the ts and I am *not* > using a da, if I want to use an implicit time stepping routine: > > 1. I have to explicitly provide the Jacobian > > Yes > > > 2. When I do provide the Jacobian, if I want to access the elements of > x(t) to construct f(t,x), I need to use a const PetscScalar and a > VecGetArrayRead to get it to work. > > Presumably you call VecGetArray() instead? > > > > > > 3. 
My code works without declaring const when I'm using an explicit > scheme. > > > > In contrast, if I solve a problem using a da, my code works, I can use > implicit schemes without having to provide the Jacobian, and I don't have > to use const anywhere. > > The use with DMDA provides automatic routines for computing the needed > Jacobians using finite differencing of your provided function and coloring > of the Jacobian. This results in reasonably efficient computation of > Jacobians that work in most (almost all) cases. > > > > Can someone clarify what is expected/preferred? > > You should always use VecGetArrayRead() for vectors you are accessing > but NOT changing the values in. There is no reason not and it provides the > potential for higher performance. > > The algebraic solvers have additional checks to prevent peopled from > inadvertently changing the entries in x (which would produce bugs). > Presumably this results in generating an error when you call VecGetArray(). > At least some of the TS explicit calls do not have such checks. They could > be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > Thanks for pointing out the inconsistency > > Barry > > > > > -- > > gideon > > -- gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 12 09:43:43 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 12 Nov 2019 15:43:43 +0000 Subject: [petsc-users] ts behavior question In-Reply-To: References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> Message-ID: <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> For any vector you only read you should use the read version. Sometimes the vector may not be locked and hence the other routine can be used but that may change as we add more locks and improve the code. So best to do it right > On Nov 12, 2019, at 9:26 AM, Gideon Simpson wrote: > > So, in principle, should we actually be using DMDAVecGetArrayRead in this context? I seem to be able to get away with DMDAVecGetArray with all time steppers. I am not sure why DMDAVecGetArray would work if VecGetArray did not work. Internally it calls VecGetArray() that will do the check. If you call it on local ghosted vectors it doesn't check if the vector is locked since the ghosted version is a copy of the true locked vector. Barry > > On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. wrote: > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users wrote: > > > > I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: > > 1. I have to explicitly provide the Jacobian > > Yes > > > 2. When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. > > Presumably you call VecGetArray() instead? > > > > > > 3. My code works without declaring const when I'm using an explicit scheme. > > > > In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. > > The use with DMDA provides automatic routines for computing the needed Jacobians using finite differencing of your provided function and coloring of the Jacobian. This results in reasonably efficient computation of Jacobians that work in most (almost all) cases. > > > > Can someone clarify what is expected/preferred? 
> > You should always use VecGetArrayRead() for vectors you are accessing but NOT changing the values in. There is no reason not and it provides the potential for higher performance. > > The algebraic solvers have additional checks to prevent peopled from inadvertently changing the entries in x (which would produce bugs). Presumably this results in generating an error when you call VecGetArray(). At least some of the TS explicit calls do not have such checks. They could be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > Thanks for pointing out the inconsistency > > Barry > > > > > -- > > gideon > > > > -- > gideon From hongzhang at anl.gov Tue Nov 12 09:58:19 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 12 Nov 2019 15:58:19 +0000 Subject: [petsc-users] ts behavior question In-Reply-To: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> Message-ID: <7CE1276E-EDD7-4245-8E44-B7127326A27C@anl.gov> > On Nov 11, 2019, at 11:33 PM, Smith, Barry F. via petsc-users wrote: > > > >> On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users wrote: >> >> I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: >> 1. I have to explicitly provide the Jacobian > > Yes Alternatively, -snes_fd can be used to approximate the Jacobian with normal finite differences (no coloring). The FD approximation is not efficient, but should work for small problems, and it is also useful for testing your hand-written Jacobian (via -snes_test_jacobian) Hong (Mr.) > >> 2. When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. > > Presumably you call VecGetArray() instead? >> >> >> 3. My code works without declaring const when I'm using an explicit scheme. >> >> In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. > > The use with DMDA provides automatic routines for computing the needed Jacobians using finite differencing of your provided function and coloring of the Jacobian. This results in reasonably efficient computation of Jacobians that work in most (almost all) cases. >> >> Can someone clarify what is expected/preferred? > > You should always use VecGetArrayRead() for vectors you are accessing but NOT changing the values in. There is no reason not and it provides the potential for higher performance. > > The algebraic solvers have additional checks to prevent peopled from inadvertently changing the entries in x (which would produce bugs). Presumably this results in generating an error when you call VecGetArray(). At least some of the TS explicit calls do not have such checks. They could be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > Thanks for pointing out the inconsistency > > Barry > >> >> -- >> gideon > From gideon.simpson at gmail.com Tue Nov 12 14:09:55 2019 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 12 Nov 2019 15:09:55 -0500 Subject: [petsc-users] ts behavior question In-Reply-To: <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> Message-ID: So this might be a resolution/another question. 
Part of the reason to use the da is that it provides you with ghost points. If you're only accessing the dependent variables entries with DMDAVecGetArrayRead, then you can't modify the ghost points. If you can't modify the ghost points here, where would you do so in the context of a problem with, for instance, time dependent boundary conditions? On Tue, Nov 12, 2019 at 10:43 AM Smith, Barry F. wrote: > > For any vector you only read you should use the read version. > > Sometimes the vector may not be locked and hence the other routine can > be used but that may change as we add more locks and improve the code. So > best to do it right > > > On Nov 12, 2019, at 9:26 AM, Gideon Simpson > wrote: > > > > So, in principle, should we actually be using DMDAVecGetArrayRead in > this context? I seem to be able to get away with DMDAVecGetArray with all > time steppers. > > I am not sure why DMDAVecGetArray would work if VecGetArray did not > work. Internally it calls VecGetArray() that will do the check. If you call > it on local ghosted vectors it doesn't check if the vector is locked since > the ghosted version is a copy of the true locked vector. > > Barry > > > > > On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. > wrote: > > > > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > I noticed that when I am solving a problem with the ts and I am *not* > using a da, if I want to use an implicit time stepping routine: > > > 1. I have to explicitly provide the Jacobian > > > > Yes > > > > > 2. When I do provide the Jacobian, if I want to access the elements of > x(t) to construct f(t,x), I need to use a const PetscScalar and a > VecGetArrayRead to get it to work. > > > > Presumably you call VecGetArray() instead? > > > > > > > > > 3. My code works without declaring const when I'm using an explicit > scheme. > > > > > > In contrast, if I solve a problem using a da, my code works, I can use > implicit schemes without having to provide the Jacobian, and I don't have > to use const anywhere. > > > > The use with DMDA provides automatic routines for computing the needed > Jacobians using finite differencing of your provided function and coloring > of the Jacobian. This results in reasonably efficient computation of > Jacobians that work in most (almost all) cases. > > > > > > Can someone clarify what is expected/preferred? > > > > You should always use VecGetArrayRead() for vectors you are accessing > but NOT changing the values in. There is no reason not and it provides the > potential for higher performance. > > > > The algebraic solvers have additional checks to prevent peopled from > inadvertently changing the entries in x (which would produce bugs). > Presumably this results in generating an error when you call VecGetArray(). > At least some of the TS explicit calls do not have such checks. They could > be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > > > Thanks for pointing out the inconsistency > > > > Barry > > > > > > > > -- > > > gideon > > > > > > > > -- > > gideon > > -- gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 12 14:41:38 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Tue, 12 Nov 2019 20:41:38 +0000 Subject: [petsc-users] ts behavior question In-Reply-To: References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> Message-ID: <28124FE8-4FD1-4BAD-8FEC-52B6177EB178@mcs.anl.gov> > On Nov 12, 2019, at 2:09 PM, Gideon Simpson wrote: > > So this might be a resolution/another question. Part of the reason to use the da is that it provides you with ghost points. If you're only accessing the dependent variables entries with DMDAVecGetArrayRead, then you can't modify the ghost points. If you can't modify the ghost points here, where would you do so in the context of a problem with, for instance, time dependent boundary conditions? In that case, as I say below, you have a ghosted local copy and you can put whatever values you wish into those ghosted locations. That is, when using ghosted local vectors you don't need to use the Read() version. Barry Note: if I were writing the code I would open the ghosted local input vector as writeable to put in the ghost values. Close it and then separately open it again as Read() to use in compute the needed TS functions. This is certainly not necessary but it helps with code maintainability and to decreases the likelihood of bugs. You have one set of access where you are legitimately changing values thus should not use Read() and another where you should not be changing values and thus should use read(). > > On Tue, Nov 12, 2019 at 10:43 AM Smith, Barry F. wrote: > > For any vector you only read you should use the read version. > > Sometimes the vector may not be locked and hence the other routine can be used but that may change as we add more locks and improve the code. So best to do it right > > > On Nov 12, 2019, at 9:26 AM, Gideon Simpson wrote: > > > > So, in principle, should we actually be using DMDAVecGetArrayRead in this context? I seem to be able to get away with DMDAVecGetArray with all time steppers. > > I am not sure why DMDAVecGetArray would work if VecGetArray did not work. Internally it calls VecGetArray() that will do the check. If you call it on local ghosted vectors it doesn't check if the vector is locked since the ghosted version is a copy of the true locked vector. > > Barry > > > > > On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. wrote: > > > > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users wrote: > > > > > > I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: > > > 1. I have to explicitly provide the Jacobian > > > > Yes > > > > > 2. When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. > > > > Presumably you call VecGetArray() instead? > > > > > > > > > 3. My code works without declaring const when I'm using an explicit scheme. > > > > > > In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. > > > > The use with DMDA provides automatic routines for computing the needed Jacobians using finite differencing of your provided function and coloring of the Jacobian. This results in reasonably efficient computation of Jacobians that work in most (almost all) cases. > > > > > > Can someone clarify what is expected/preferred? 
> > > > You should always use VecGetArrayRead() for vectors you are accessing but NOT changing the values in. There is no reason not and it provides the potential for higher performance. > > > > The algebraic solvers have additional checks to prevent peopled from inadvertently changing the entries in x (which would produce bugs). Presumably this results in generating an error when you call VecGetArray(). At least some of the TS explicit calls do not have such checks. They could be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > > > Thanks for pointing out the inconsistency > > > > Barry > > > > > > > > -- > > > gideon > > > > > > > > -- > > gideon > > > > -- > gideon From gideon.simpson at gmail.com Tue Nov 12 14:48:00 2019 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 12 Nov 2019 15:48:00 -0500 Subject: [petsc-users] ts behavior question In-Reply-To: <28124FE8-4FD1-4BAD-8FEC-52B6177EB178@mcs.anl.gov> References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> <28124FE8-4FD1-4BAD-8FEC-52B6177EB178@mcs.anl.gov> Message-ID: I think I'm almost with you. In my code, I make a local copy of the vector (with DMGetLocalVector) and after calling GlobaltoLocal, I call DMDAVecGetArray on the local vector. I use the array I obtain of this local copy in populating by right hand side function. Is that consistent with your the approach that you guys recommend? If I were to do as you say and have a separate set of calls for populating the ghost points, where would this fit in the ts framework? Are are you saying this would be done at the beginning of the RHS function? On Tue, Nov 12, 2019 at 3:41 PM Smith, Barry F. wrote: > > > > On Nov 12, 2019, at 2:09 PM, Gideon Simpson > wrote: > > > > So this might be a resolution/another question. Part of the reason to > use the da is that it provides you with ghost points. If you're only > accessing the dependent variables entries with DMDAVecGetArrayRead, then > you can't modify the ghost points. If you can't modify the ghost points > here, where would you do so in the context of a problem with, for instance, > time dependent boundary conditions? > > In that case, as I say below, you have a ghosted local copy and you can > put whatever values you wish into those ghosted locations. That is, when > using ghosted local vectors you don't need to use the Read() version. > > Barry > > Note: if I were writing the code I would open the ghosted local input > vector as writeable to put in the ghost values. Close it and then > separately open it again as Read() to use in compute the needed TS > functions. This is certainly not necessary but it helps with code > maintainability and to decreases the likelihood of bugs. You have one set > of access where you are legitimately changing values thus should not use > Read() and another where you should not be changing values and thus should > use read(). > > > > > > > > On Tue, Nov 12, 2019 at 10:43 AM Smith, Barry F. > wrote: > > > > For any vector you only read you should use the read version. > > > > Sometimes the vector may not be locked and hence the other routine can > be used but that may change as we add more locks and improve the code. So > best to do it right > > > > > On Nov 12, 2019, at 9:26 AM, Gideon Simpson > wrote: > > > > > > So, in principle, should we actually be using DMDAVecGetArrayRead in > this context? I seem to be able to get away with DMDAVecGetArray with all > time steppers. 
> > > > I am not sure why DMDAVecGetArray would work if VecGetArray did not > work. Internally it calls VecGetArray() that will do the check. If you call > it on local ghosted vectors it doesn't check if the vector is locked since > the ghosted version is a copy of the true locked vector. > > > > Barry > > > > > > > > On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. > wrote: > > > > > > > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > > > I noticed that when I am solving a problem with the ts and I am > *not* using a da, if I want to use an implicit time stepping routine: > > > > 1. I have to explicitly provide the Jacobian > > > > > > Yes > > > > > > > 2. When I do provide the Jacobian, if I want to access the elements > of x(t) to construct f(t,x), I need to use a const PetscScalar and a > VecGetArrayRead to get it to work. > > > > > > Presumably you call VecGetArray() instead? > > > > > > > > > > > > 3. My code works without declaring const when I'm using an explicit > scheme. > > > > > > > > In contrast, if I solve a problem using a da, my code works, I can > use implicit schemes without having to provide the Jacobian, and I don't > have to use const anywhere. > > > > > > The use with DMDA provides automatic routines for computing the > needed Jacobians using finite differencing of your provided function and > coloring of the Jacobian. This results in reasonably efficient computation > of Jacobians that work in most (almost all) cases. > > > > > > > > Can someone clarify what is expected/preferred? > > > > > > You should always use VecGetArrayRead() for vectors you are > accessing but NOT changing the values in. There is no reason not and it > provides the potential for higher performance. > > > > > > The algebraic solvers have additional checks to prevent peopled from > inadvertently changing the entries in x (which would produce bugs). > Presumably this results in generating an error when you call VecGetArray(). > At least some of the TS explicit calls do not have such checks. They could > be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > > > > > Thanks for pointing out the inconsistency > > > > > > Barry > > > > > > > > > > > -- > > > > gideon > > > > > > > > > > > > -- > > > gideon > > > > > > > > -- > > gideon > > -- gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 12 17:26:53 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 12 Nov 2019 23:26:53 +0000 Subject: [petsc-users] ts behavior question In-Reply-To: References: <7EC7C513-A02E-4CFA-80E1-A8A1E783A69A@anl.gov> <8DE0C027-9119-4FC0-A638-CDA47814EBB1@mcs.anl.gov> <28124FE8-4FD1-4BAD-8FEC-52B6177EB178@mcs.anl.gov> Message-ID: > On Nov 12, 2019, at 2:48 PM, Gideon Simpson wrote: > > I think I'm almost with you. In my code, I make a local copy of the vector (with DMGetLocalVector) and after calling GlobaltoLocal, I call DMDAVecGetArray on the local vector. I use the array I obtain of this local copy in populating by right hand side function. Is that consistent with your the approach that you guys recommend? When you need ghost points yes. > If I were to do as you say and have a separate set of calls for populating the ghost points, where would this fit in the ts framework? You could do it after you after the DMDAVecGetArray() call. So it is in the same routine. 
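A rough sketch of that pattern inside a TS RHS function, assuming a 1-D DMDA created with DM_BOUNDARY_GHOSTED and stencil width 1; the sin(t) boundary value, the grid spacing, and the centered difference are placeholders, not taken from any code in this thread:

#include <petscts.h>
#include <petscdmda.h>

PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec X, Vec F, void *ctx)
{
  DM                da;
  Vec               Xloc;
  PetscScalar       *xw, *f;
  const PetscScalar *x;
  PetscInt          i, xs, xm, Mx;
  PetscReal         hx;
  PetscErrorCode    ierr;

  PetscFunctionBeginUser;
  ierr = TSGetDM(ts, &da);CHKERRQ(ierr);
  ierr = DMDAGetInfo(da, NULL, &Mx, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL);CHKERRQ(ierr);
  ierr = DMDAGetCorners(da, &xs, NULL, NULL, &xm, NULL, NULL);CHKERRQ(ierr);
  hx   = 1.0/(PetscReal)(Mx - 1);                    /* placeholder grid spacing */

  ierr = DMGetLocalVector(da, &Xloc);CHKERRQ(ierr);
  ierr = DMGlobalToLocalBegin(da, X, INSERT_VALUES, Xloc);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(da, X, INSERT_VALUES, Xloc);CHKERRQ(ierr);

  /* Pass 1: open the ghosted local vector writeable and fill the boundary
     ghost slots with the time-dependent boundary value (placeholder). */
  ierr = DMDAVecGetArray(da, Xloc, &xw);CHKERRQ(ierr);
  if (xs == 0)       xw[-1] = PetscSinReal(t);       /* left ghost slot  */
  if (xs + xm == Mx) xw[Mx] = PetscSinReal(t);       /* right ghost slot */
  ierr = DMDAVecRestoreArray(da, Xloc, &xw);CHKERRQ(ierr);

  /* Pass 2: reopen read-only and compute the RHS (a centered second
     difference here, purely as an illustration). */
  ierr = DMDAVecGetArrayRead(da, Xloc, &x);CHKERRQ(ierr);
  ierr = DMDAVecGetArray(da, F, &f);CHKERRQ(ierr);
  for (i = xs; i < xs + xm; i++) f[i] = (x[i-1] - 2.0*x[i] + x[i+1])/(hx*hx);
  ierr = DMDAVecRestoreArray(da, F, &f);CHKERRQ(ierr);
  ierr = DMDAVecRestoreArrayRead(da, Xloc, &x);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(da, &Xloc);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The first DMDAVecGetArray() pass is the only place the input data is modified; everything after that point treats it as read-only.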
It would just "separate" more clearly the "setting the ghost values" from the "competing the RHS function". Barry > Are are you saying this would be done at the beginning of the RHS function? > > On Tue, Nov 12, 2019 at 3:41 PM Smith, Barry F. wrote: > > > > On Nov 12, 2019, at 2:09 PM, Gideon Simpson wrote: > > > > So this might be a resolution/another question. Part of the reason to use the da is that it provides you with ghost points. If you're only accessing the dependent variables entries with DMDAVecGetArrayRead, then you can't modify the ghost points. If you can't modify the ghost points here, where would you do so in the context of a problem with, for instance, time dependent boundary conditions? > > In that case, as I say below, you have a ghosted local copy and you can put whatever values you wish into those ghosted locations. That is, when using ghosted local vectors you don't need to use the Read() version. > > Barry > > Note: if I were writing the code I would open the ghosted local input vector as writeable to put in the ghost values. Close it and then separately open it again as Read() to use in compute the needed TS functions. This is certainly not necessary but it helps with code maintainability and to decreases the likelihood of bugs. You have one set of access where you are legitimately changing values thus should not use Read() and another where you should not be changing values and thus should use read(). > > > > > > > > On Tue, Nov 12, 2019 at 10:43 AM Smith, Barry F. wrote: > > > > For any vector you only read you should use the read version. > > > > Sometimes the vector may not be locked and hence the other routine can be used but that may change as we add more locks and improve the code. So best to do it right > > > > > On Nov 12, 2019, at 9:26 AM, Gideon Simpson wrote: > > > > > > So, in principle, should we actually be using DMDAVecGetArrayRead in this context? I seem to be able to get away with DMDAVecGetArray with all time steppers. > > > > I am not sure why DMDAVecGetArray would work if VecGetArray did not work. Internally it calls VecGetArray() that will do the check. If you call it on local ghosted vectors it doesn't check if the vector is locked since the ghosted version is a copy of the true locked vector. > > > > Barry > > > > > > > > On Tue, Nov 12, 2019 at 12:33 AM Smith, Barry F. wrote: > > > > > > > > > > On Nov 11, 2019, at 7:00 PM, Gideon Simpson via petsc-users wrote: > > > > > > > > I noticed that when I am solving a problem with the ts and I am *not* using a da, if I want to use an implicit time stepping routine: > > > > 1. I have to explicitly provide the Jacobian > > > > > > Yes > > > > > > > 2. When I do provide the Jacobian, if I want to access the elements of x(t) to construct f(t,x), I need to use a const PetscScalar and a VecGetArrayRead to get it to work. > > > > > > Presumably you call VecGetArray() instead? > > > > > > > > > > > > 3. My code works without declaring const when I'm using an explicit scheme. > > > > > > > > In contrast, if I solve a problem using a da, my code works, I can use implicit schemes without having to provide the Jacobian, and I don't have to use const anywhere. > > > > > > The use with DMDA provides automatic routines for computing the needed Jacobians using finite differencing of your provided function and coloring of the Jacobian. This results in reasonably efficient computation of Jacobians that work in most (almost all) cases. > > > > > > > > Can someone clarify what is expected/preferred? 
> > > > > > You should always use VecGetArrayRead() for vectors you are accessing but NOT changing the values in. There is no reason not and it provides the potential for higher performance. > > > > > > The algebraic solvers have additional checks to prevent peopled from inadvertently changing the entries in x (which would produce bugs). Presumably this results in generating an error when you call VecGetArray(). At least some of the TS explicit calls do not have such checks. They could be added and should be added. https://gitlab.com/petsc/petsc/issues/493 > > > > > > Thanks for pointing out the inconsistency > > > > > > Barry > > > > > > > > > > > -- > > > > gideon > > > > > > > > > > > > -- > > > gideon > > > > > > > > -- > > gideon > > > > -- > gideon From hgbk2008 at gmail.com Thu Nov 14 14:04:42 2019 From: hgbk2008 at gmail.com (hg) Date: Thu, 14 Nov 2019 21:04:42 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> Message-ID: Hello It turns out that hwloc is not installed on the cluster system that I'm using. Without hwloc, pastix will run into the branch using sched_setaffinity and caused error (see above at sopalin_thread.c). I'm not able to understand and find a solution with sched_setaffinity so I think enabling hwloc is an easier solution. Between, hwloc is recommended to compile Pastix according to those threads: https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 https://solverstack.gitlabpages.inria.fr/pastix/Bindings.html hwloc is supported in PETSc so I assumed a clean and easy solution to compile with --download-hwloc. I made some changes in config/BuildSystem/config/packages/PaStiX.py to tell pastix to link to hwloc: ... self.hwloc = framework.require('config.packages.hwloc',self) ... 
if self.hwloc.found: g.write('CCPASTIX := $(CCPASTIX) -DWITH_HWLOC '+self.headers.toString(self.hwloc.include)+'\n') g.write('EXTRALIB := $(EXTRALIB) '+self.libraries.toString(self.hwloc.dlib)+'\n') But it does not compile: Possible ERROR while running linker: exit code 1 stderr: /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_init': /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:822: undefined reference to `hwloc_topology_init' /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:828: undefined reference to `hwloc_topology_load' /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_clean': /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:4677: undefined reference to `hwloc_topology_destroy' /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function `hwloc_get_obj_by_type': /opt/petsc-dev/include/hwloc/inlines.h:76: undefined reference to `hwloc_get_type_depth' /opt/petsc-dev/include/hwloc/inlines.h:81: undefined reference to `hwloc_get_obj_by_depth' /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function `sopalin_bindthread': /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:538: undefined reference to `hwloc_bitmap_dup' /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:539: undefined reference to `hwloc_bitmap_singlify' /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:543: undefined reference to `hwloc_set_cpubind' /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:567: undefined reference to `hwloc_bitmap_free' /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:548: undefined reference to `hwloc_bitmap_asprintf' Any idea is appreciated. I can attach configure.log as needed. Giang On Thu, Nov 7, 2019 at 12:18 AM hg wrote: > Hi Barry > > Maybe you're right, sched_setaffinity returns EINVAL in my case, Probably > the scheduler does not allow the process to bind to thread on its own. > > Giang > > > On Wed, Nov 6, 2019 at 4:52 PM Smith, Barry F. wrote: > >> >> You can also just look at configure.log where it will show the calling >> sequence of how PETSc configured and built Pastix. The recipe is in >> config/BuildSystem/config/packages/PaStiX.py we don't monkey with low level >> things like the affinity of external packages. My guess is that your >> cluster system has inconsistent parts related to this, that one tool works >> and another does not indicates they are inconsistent with respect to each >> other in what they expect. >> >> Barry >> >> >> >> >> > On Nov 6, 2019, at 4:02 AM, Matthew Knepley wrote: >> > >> > On Wed, Nov 6, 2019 at 4:40 AM hg wrote: >> > Look into >> arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c >> I saw something like: >> > >> > #ifdef HAVE_OLD_SCHED_SETAFFINITY >> > if(sched_setaffinity(0,&mask) < 0) >> > #else /* HAVE_OLD_SCHED_SETAFFINITY */ >> > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) >> > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ >> > { >> > perror("sched_setaffinity"); >> > EXIT(MOD_SOPALIN, INTERNAL_ERR); >> > } >> > >> > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY >> during compilation? 
>> > >> > May I know how to trigger re-compilation of external packages with >> petsc? I may go in there and check what's going on. >> > >> > If we built it during configure, then you can just go to >> > >> > $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ >> > >> > and rebuild/install it to test. If you want configure to do it, you >> have to delete >> > >> > $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix >> > >> > and reconfigure. >> > >> > Thanks, >> > >> > Matt >> > >> > Giang >> > >> > >> > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: >> > sched_setaffinity: Invalid argument only happens when I launch the job >> with sbatch. Running without scheduler is fine. I think this has something >> to do with pastix. >> > >> > Giang >> > >> > >> > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. >> wrote: >> > >> > Google finds this >> https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 >> > >> > >> > >> > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > > >> > > I have no idea. That is a good question for the PasTix list. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: >> > > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and >> also OMP_NUM_THREADS to 1 >> > > >> > > Giang >> > > >> > > >> > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley >> wrote: >> > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > > Hello >> > > >> > > I got crashed when using Pastix as solver for KSP. The error message >> looks like: >> > > >> > > .... >> > > NUMBER of BUBBLE 1 >> > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >> > > ** End of Partition & Distribution phase ** >> > > Time to analyze 0.225 s >> > > Number of nonzeros in factorized matrix 708784076 >> > > Fill-in 12.2337 >> > > Number of operations (LU) 2.80185e+12 >> > > Prediction Time to factorize (AMD 6180 MKL) 394 s >> > > 0 : SolverMatrix size (without coefficients) 32.4 MB >> > > 0 : Number of nonzeros (local block structure) 365309391 >> > > Numerical Factorization (LU) : >> > > 0 : Internal CSC size 1.08 GB >> > > Time to fill internal csc 6.66 s >> > > --- Sopalin : Allocation de la structure globale --- >> > > --- Fin Sopalin Init --- >> > > --- Initialisation des tableaux globaux --- >> > > sched_setaffinity: Invalid argument >> > > [node083:165071] *** Process received signal *** >> > > [node083:165071] Signal: Aborted (6) >> > > [node083:165071] Signal code: (-6) >> > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >> > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >> > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >> > > [node083:165071] [ 3] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >> > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >> communication, 0 out-of-core) >> > > --- Sopalin : Local structure allocation --- >> > > >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >> > > [node083:165071] [ 5] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >> > > [node083:165071] [ 6] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >> > > [node083:165071] [ 7] >> 
/sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >> > > [node083:165071] [ 8] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >> > > [node083:165071] [ 9] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >> > > [node083:165071] [10] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >> > > [node083:165071] [11] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >> > > [node083:165071] [12] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >> > > [node083:165071] [13] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >> > > [node083:165071] [14] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >> > > [node083:165071] [15] >> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >> > > >> > > Does anyone have an idea what is the problem and how to fix it? The >> PETSc parameters I used are as below: >> > > >> > > It looks like PasTix is having trouble setting the thread affinity: >> > > >> > > sched_setaffinity: Invalid argument >> > > >> > > so it may be your build of PasTix. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > -pc_type lu >> > > -pc_factor_mat_solver_package pastix >> > > -mat_pastix_verbose 2 >> > > -mat_pastix_threadnbr 1 >> > > >> > > Giang >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > > https://www.cse.buffalo.edu/~knepley/ >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > > https://www.cse.buffalo.edu/~knepley/ >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From s.ritwik98 at gmail.com Thu Nov 14 22:15:09 2019 From: s.ritwik98 at gmail.com (Ritwik Saha) Date: Thu, 14 Nov 2019 23:15:09 -0500 Subject: [petsc-users] Including Implementations in my code Message-ID: Hi All, PETSc provides various implementations of functions like VecAXPY() in CUDA. I am talking specifically about VecAXPY_SeqCUDA() in src/vec/vec/impls/seq/seqcuda/veccuda2.cu . How to I include these functions in my C code? Thanks in advance. Regards, Ritwik Saha -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Nov 14 22:49:35 2019 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Nov 2019 21:49:35 -0700 Subject: [petsc-users] Including Implementations in my code In-Reply-To: References: Message-ID: <87d0dtg2w0.fsf@jedbrown.org> Ritwik Saha via petsc-users writes: > Hi All, > > PETSc provides various implementations of functions like VecAXPY() in CUDA. > I am talking specifically about VecAXPY_SeqCUDA() in > src/vec/vec/impls/seq/seqcuda/veccuda2.cu . 
How to I include these > functions in my C code? I'm not sure I follow. If you want to call those functions, set the VecType to VECCUDA. For many examples, this is done via the run-time option -dm_vec_type cuda (see many examples in the PETSc source tree). If you're trying to copy the implementation into your code without using PETSc, you're on your own. From hgbk2008 at gmail.com Fri Nov 15 08:34:48 2019 From: hgbk2008 at gmail.com (hg) Date: Fri, 15 Nov 2019 15:34:48 +0100 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> Message-ID: FYI, this problem is fixed, providing that hwloc is added to dependencies of Pastix. Giang On Thu, Nov 14, 2019 at 9:04 PM hg wrote: > Hello > > It turns out that hwloc is not installed on the cluster system that I'm > using. Without hwloc, pastix will run into the branch using > sched_setaffinity and caused error (see above at sopalin_thread.c). I'm not > able to understand and find a solution with sched_setaffinity so I think > enabling hwloc is an easier solution. Between, hwloc is recommended to > compile Pastix according to those threads: > > > https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > https://solverstack.gitlabpages.inria.fr/pastix/Bindings.html > > hwloc is supported in PETSc so I assumed a clean and easy solution to > compile with --download-hwloc. I made some changes in > config/BuildSystem/config/packages/PaStiX.py to tell pastix to link to > hwloc: > > ... > self.hwloc = framework.require('config.packages.hwloc',self) > ... > if self.hwloc.found: > g.write('CCPASTIX := $(CCPASTIX) -DWITH_HWLOC > '+self.headers.toString(self.hwloc.include)+'\n') > g.write('EXTRALIB := $(EXTRALIB) > '+self.libraries.toString(self.hwloc.dlib)+'\n') > > But it does not compile: > > Possible ERROR while running linker: exit code 1 > stderr: > /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_init': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:822: > undefined reference to `hwloc_topology_init' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:828: > undefined reference to `hwloc_topology_load' > /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_clean': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:4677: > undefined reference to `hwloc_topology_destroy' > /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function > `hwloc_get_obj_by_type': > /opt/petsc-dev/include/hwloc/inlines.h:76: undefined reference to > `hwloc_get_type_depth' > /opt/petsc-dev/include/hwloc/inlines.h:81: undefined reference to > `hwloc_get_obj_by_depth' > /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function > `sopalin_bindthread': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:538: > undefined reference to `hwloc_bitmap_dup' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:539: > undefined reference to `hwloc_bitmap_singlify' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:543: > undefined reference to `hwloc_set_cpubind' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:567: > undefined 
reference to `hwloc_bitmap_free' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:548: > undefined reference to `hwloc_bitmap_asprintf' > > Any idea is appreciated. I can attach configure.log as needed. > > Giang > > > On Thu, Nov 7, 2019 at 12:18 AM hg wrote: > >> Hi Barry >> >> Maybe you're right, sched_setaffinity returns EINVAL in my case, Probably >> the scheduler does not allow the process to bind to thread on its own. >> >> Giang >> >> >> On Wed, Nov 6, 2019 at 4:52 PM Smith, Barry F. >> wrote: >> >>> >>> You can also just look at configure.log where it will show the calling >>> sequence of how PETSc configured and built Pastix. The recipe is in >>> config/BuildSystem/config/packages/PaStiX.py we don't monkey with low level >>> things like the affinity of external packages. My guess is that your >>> cluster system has inconsistent parts related to this, that one tool works >>> and another does not indicates they are inconsistent with respect to each >>> other in what they expect. >>> >>> Barry >>> >>> >>> >>> >>> > On Nov 6, 2019, at 4:02 AM, Matthew Knepley wrote: >>> > >>> > On Wed, Nov 6, 2019 at 4:40 AM hg wrote: >>> > Look into >>> arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c >>> I saw something like: >>> > >>> > #ifdef HAVE_OLD_SCHED_SETAFFINITY >>> > if(sched_setaffinity(0,&mask) < 0) >>> > #else /* HAVE_OLD_SCHED_SETAFFINITY */ >>> > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) >>> > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ >>> > { >>> > perror("sched_setaffinity"); >>> > EXIT(MOD_SOPALIN, INTERNAL_ERR); >>> > } >>> > >>> > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY >>> during compilation? >>> > >>> > May I know how to trigger re-compilation of external packages with >>> petsc? I may go in there and check what's going on. >>> > >>> > If we built it during configure, then you can just go to >>> > >>> > $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ >>> > >>> > and rebuild/install it to test. If you want configure to do it, you >>> have to delete >>> > >>> > $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix >>> > >>> > and reconfigure. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Giang >>> > >>> > >>> > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: >>> > sched_setaffinity: Invalid argument only happens when I launch the job >>> with sbatch. Running without scheduler is fine. I think this has something >>> to do with pastix. >>> > >>> > Giang >>> > >>> > >>> > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. >>> wrote: >>> > >>> > Google finds this >>> https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 >>> > >>> > >>> > >>> > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > > >>> > > I have no idea. That is a good question for the PasTix list. >>> > > >>> > > Thanks, >>> > > >>> > > Matt >>> > > >>> > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: >>> > > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 >>> and also OMP_NUM_THREADS to 1 >>> > > >>> > > Giang >>> > > >>> > > >>> > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley >>> wrote: >>> > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > > Hello >>> > > >>> > > I got crashed when using Pastix as solver for KSP. The error message >>> looks like: >>> > > >>> > > .... 
>>> > > NUMBER of BUBBLE 1 >>> > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 >>> > > ** End of Partition & Distribution phase ** >>> > > Time to analyze 0.225 s >>> > > Number of nonzeros in factorized matrix 708784076 >>> > > Fill-in 12.2337 >>> > > Number of operations (LU) 2.80185e+12 >>> > > Prediction Time to factorize (AMD 6180 MKL) 394 s >>> > > 0 : SolverMatrix size (without coefficients) 32.4 MB >>> > > 0 : Number of nonzeros (local block structure) 365309391 >>> > > Numerical Factorization (LU) : >>> > > 0 : Internal CSC size 1.08 GB >>> > > Time to fill internal csc 6.66 s >>> > > --- Sopalin : Allocation de la structure globale --- >>> > > --- Fin Sopalin Init --- >>> > > --- Initialisation des tableaux globaux --- >>> > > sched_setaffinity: Invalid argument >>> > > [node083:165071] *** Process received signal *** >>> > > [node083:165071] Signal: Aborted (6) >>> > > [node083:165071] Signal code: (-6) >>> > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] >>> > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] >>> > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] >>> > > [node083:165071] [ 3] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] >>> > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 >>> communication, 0 out-of-core) >>> > > --- Sopalin : Local structure allocation --- >>> > > >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] >>> > > [node083:165071] [ 5] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] >>> > > [node083:165071] [ 6] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] >>> > > [node083:165071] [ 7] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] >>> > > [node083:165071] [ 8] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] >>> > > [node083:165071] [ 9] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] >>> > > [node083:165071] [10] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] >>> > > [node083:165071] [11] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] >>> > > [node083:165071] [12] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] >>> > > [node083:165071] [13] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] >>> > > [node083:165071] [14] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] >>> > > [node083:165071] [15] >>> /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] >>> > > >>> > > Does anyone have an idea what is the problem and how to fix it? The >>> PETSc parameters I used are as below: >>> > > >>> > > It looks like PasTix is having trouble setting the thread affinity: >>> > > >>> > > sched_setaffinity: Invalid argument >>> > > >>> > > so it may be your build of PasTix. 
>>> > > >>> > > Thanks, >>> > > >>> > > Matt >>> > > >>> > > -pc_type lu >>> > > -pc_factor_mat_solver_package pastix >>> > > -mat_pastix_verbose 2 >>> > > -mat_pastix_threadnbr 1 >>> > > >>> > > Giang >>> > > >>> > > >>> > > >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > > https://www.cse.buffalo.edu/~knepley/ >>> > > >>> > > >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > > https://www.cse.buffalo.edu/~knepley/ >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > https://www.cse.buffalo.edu/~knepley/ >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From yjwu16 at gmail.com Sun Nov 17 09:25:40 2019 From: yjwu16 at gmail.com (Yingjie Wu) Date: Sun, 17 Nov 2019 23:25:40 +0800 Subject: [petsc-users] Question about changing time step during calculation Message-ID: Dear Petsc developers Hi, Recently I am trying to using TS to solve time-dependent nonlinear PDEs. In my program, next time step is based on the results of previous time step. I want to add this control in TSmonitor() to change time step length in calculation. I referred to user guide but I didn't find what I wanted. Please give me some advice. Thanks, Yingjie -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Sun Nov 17 17:32:26 2019 From: hongzhang at anl.gov (Zhang, Hong) Date: Sun, 17 Nov 2019 23:32:26 +0000 Subject: [petsc-users] Question about changing time step during calculation In-Reply-To: References: Message-ID: TSSetTimeStep() https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetTimeStep.html#TSSetTimeStep If you want to decide the step size by yourself, make sure that the adaptivity is turned off, e.g., with -ts_adapt_type none Btw, have you tried the available TSAdapt types? Is there anything special in your problems so that none of these adaptors works? Hong (Mr.) On Nov 17, 2019, at 9:25 AM, Yingjie Wu via petsc-users > wrote: Dear Petsc developers Hi, Recently I am trying to using TS to solve time-dependent nonlinear PDEs. In my program, next time step is based on the results of previous time step. I want to add this control in TSmonitor() to change time step length in calculation. I referred to user guide but I didn't find what I wanted. Please give me some advice. Thanks, Yingjie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 17 23:24:51 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Mon, 18 Nov 2019 05:24:51 +0000 Subject: [petsc-users] Question about changing time step during calculation In-Reply-To: References: Message-ID: <3A83F643-C6CE-4010-92B7-BF8E2E945AEC@anl.gov> > On Nov 17, 2019, at 5:32 PM, Zhang, Hong via petsc-users wrote: > > TSSetTimeStep() > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetTimeStep.html#TSSetTimeStep > > If you want to decide the step size by yourself, make sure that the adaptivity is turned off, e.g., with -ts_adapt_type none > > Btw, have you tried the available TSAdapt types? Is there anything special in your problems so that none of these adaptors works? It is also possible to register your own adaptor using TSAdaptRegister(). You can start writing you own by copying the basic adapter src/ts/adapt/impls/basic/adaptbasic.c; you can also look at the other adaptors for ideas. Note this is an advanced topic that only needs to be used when none of standard adapters are useful for your situation. Barry > > Hong (Mr.) > >> On Nov 17, 2019, at 9:25 AM, Yingjie Wu via petsc-users wrote: >> >> Dear Petsc developers >> Hi, >> Recently I am trying to using TS to solve time-dependent nonlinear PDEs. In my program, next time step is based on the results of previous time step. I want to add this control in TSmonitor() to change time step length in calculation. I referred to user guide but I didn't find what I wanted. Please give me some advice. >> >> Thanks, >> Yingjie > From repepo at gmail.com Tue Nov 19 04:40:22 2019 From: repepo at gmail.com (Santiago Andres Triana) Date: Tue, 19 Nov 2019 11:40:22 +0100 Subject: [petsc-users] problem downloading "fix-syntax-for-nag.tar.gx" Message-ID: Hello petsc-users: I found this error when configure tries to download fblaslapack: ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error during download/extract/detection of FBLASLAPACK: file could not be opened successfully Downloaded package FBLASLAPACK from: https://bitbucket.org/petsc/pkg-fblaslapack/get/origin/barry/2019-08-22/fix-syntax-for-nag.tar.gz is not a tarball. [or installed python cannot process compressed files] * If you are behind a firewall - please fix your proxy and rerun ./configure For example at LANL you may need to set the environmental variable http_proxy (or HTTP_PROXY?) to http://proxyout.lanl.gov * You can run with --with-packages-download-dir=/adirectory and ./configure will instruct you what packages to download manually * or you can download the above URL manually, to /yourselectedlocation/fix-syntax-for-nag.tar.gz and use the configure option: --download-fblaslapack=/yourselectedlocation/fix-syntax-for-nag.tar.gz ******************************************************************************* Any ideas? the file in question doesn't seem to exist ... Thanks a lot in advance! Santiago -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 19 05:49:22 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 19 Nov 2019 11:49:22 +0000 Subject: [petsc-users] problem downloading "fix-syntax-for-nag.tar.gx" In-Reply-To: References: Message-ID: <6FACC969-7AE8-411D-AE15-A9933A9C8A4E@anl.gov> For a while I had put in an incorrect URL in the download location. 
Perhaps you are using PETSc 3.12.0 and need to use 3.12.1 from https://www.mcs.anl.gov/petsc/download/index.html Otherwise please send configure.log > On Nov 19, 2019, at 4:40 AM, Santiago Andres Triana via petsc-users wrote: > > Hello petsc-users: > > I found this error when configure tries to download fblaslapack: > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > Error during download/extract/detection of FBLASLAPACK: > file could not be opened successfully > Downloaded package FBLASLAPACK from: https://bitbucket.org/petsc/pkg-fblaslapack/get/origin/barry/2019-08-22/fix-syntax-for-nag.tar.gz is not a tarball. > [or installed python cannot process compressed files] > * If you are behind a firewall - please fix your proxy and rerun ./configure > For example at LANL you may need to set the environmental variable http_proxy (or HTTP_PROXY?) to http://proxyout.lanl.gov > * You can run with --with-packages-download-dir=/adirectory and ./configure will instruct you what packages to download manually > * or you can download the above URL manually, to /yourselectedlocation/fix-syntax-for-nag.tar.gz > and use the configure option: > --download-fblaslapack=/yourselectedlocation/fix-syntax-for-nag.tar.gz > ******************************************************************************* > > > Any ideas? the file in question doesn't seem to exist ... Thanks a lot in advance! > > Santiago From bsmith at mcs.anl.gov Tue Nov 19 06:20:45 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 19 Nov 2019 12:20:45 +0000 Subject: [petsc-users] solve problem with pastix In-Reply-To: References: <937C0613-989A-4DD6-8A33-BB947D5E7DEB@anl.gov> <1E28761A-883F-4B66-9BA8-8367881D5BCB@mcs.anl.gov> Message-ID: <0DB2E0B0-6AB9-409A-A1F7-2A92BEF0915F@mcs.anl.gov> Thanks for the fix. https://gitlab.com/petsc/petsc/pipelines/96957999 > On Nov 14, 2019, at 2:04 PM, hg wrote: > > Hello > > It turns out that hwloc is not installed on the cluster system that I'm using. Without hwloc, pastix will run into the branch using sched_setaffinity and caused error (see above at sopalin_thread.c). I'm not able to understand and find a solution with sched_setaffinity so I think enabling hwloc is an easier solution. Between, hwloc is recommended to compile Pastix according to those threads: > > https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > https://solverstack.gitlabpages.inria.fr/pastix/Bindings.html > > hwloc is supported in PETSc so I assumed a clean and easy solution to compile with --download-hwloc. I made some changes in config/BuildSystem/config/packages/PaStiX.py to tell pastix to link to hwloc: > > ... > self.hwloc = framework.require('config.packages.hwloc',self) > ... 
> if self.hwloc.found: > g.write('CCPASTIX := $(CCPASTIX) -DWITH_HWLOC '+self.headers.toString(self.hwloc.include)+'\n') > g.write('EXTRALIB := $(EXTRALIB) '+self.libraries.toString(self.hwloc.dlib)+'\n') > > But it does not compile: > > Possible ERROR while running linker: exit code 1 > stderr: > /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_init': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:822: undefined reference to `hwloc_topology_init' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:828: undefined reference to `hwloc_topology_load' > /opt/petsc-dev/lib/libpastix.a(pastix.o): In function `pastix_task_clean': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/pastix.c:4677: undefined reference to `hwloc_topology_destroy' > /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function `hwloc_get_obj_by_type': > /opt/petsc-dev/include/hwloc/inlines.h:76: undefined reference to `hwloc_get_type_depth' > /opt/petsc-dev/include/hwloc/inlines.h:81: undefined reference to `hwloc_get_obj_by_depth' > /opt/petsc-dev/lib/libpastix.a(sopalin_thread.o): In function `sopalin_bindthread': > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:538: undefined reference to `hwloc_bitmap_dup' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:539: undefined reference to `hwloc_bitmap_singlify' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:543: undefined reference to `hwloc_set_cpubind' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:567: undefined reference to `hwloc_bitmap_free' > /home/hbui/sw2/petsc-dev/arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c:548: undefined reference to `hwloc_bitmap_asprintf' > > Any idea is appreciated. I can attach configure.log as needed. > > Giang > > > On Thu, Nov 7, 2019 at 12:18 AM hg wrote: > Hi Barry > > Maybe you're right, sched_setaffinity returns EINVAL in my case, Probably the scheduler does not allow the process to bind to thread on its own. > > Giang > > > On Wed, Nov 6, 2019 at 4:52 PM Smith, Barry F. wrote: > > You can also just look at configure.log where it will show the calling sequence of how PETSc configured and built Pastix. The recipe is in config/BuildSystem/config/packages/PaStiX.py we don't monkey with low level things like the affinity of external packages. My guess is that your cluster system has inconsistent parts related to this, that one tool works and another does not indicates they are inconsistent with respect to each other in what they expect. > > Barry > > > > > > On Nov 6, 2019, at 4:02 AM, Matthew Knepley wrote: > > > > On Wed, Nov 6, 2019 at 4:40 AM hg wrote: > > Look into arch-linux2-cxx-opt/externalpackages/pastix_5.2.3/src/sopalin/src/sopalin_thread.c I saw something like: > > > > #ifdef HAVE_OLD_SCHED_SETAFFINITY > > if(sched_setaffinity(0,&mask) < 0) > > #else /* HAVE_OLD_SCHED_SETAFFINITY */ > > if(sched_setaffinity(0,sizeof(mask),&mask) < 0) > > #endif /* HAVE_OLD_SCHED_SETAFFINITY */ > > { > > perror("sched_setaffinity"); > > EXIT(MOD_SOPALIN, INTERNAL_ERR); > > } > > > > Is there possibility that Petsc turn on HAVE_OLD_SCHED_SETAFFINITY during compilation? 
> > > > May I know how to trigger re-compilation of external packages with petsc? I may go in there and check what's going on. > > > > If we built it during configure, then you can just go to > > > > $PETSC_DIR/$PETSC_ARCH/externalpackages/*pastix*/ > > > > and rebuild/install it to test. If you want configure to do it, you have to delete > > > > $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/pkg.conf.pastix > > > > and reconfigure. > > > > Thanks, > > > > Matt > > > > Giang > > > > > > On Wed, Nov 6, 2019 at 10:12 AM hg wrote: > > sched_setaffinity: Invalid argument only happens when I launch the job with sbatch. Running without scheduler is fine. I think this has something to do with pastix. > > > > Giang > > > > > > On Wed, Nov 6, 2019 at 4:37 AM Smith, Barry F. wrote: > > > > Google finds this https://gforge.inria.fr/forum/forum.php?thread_id=32824&forum_id=599&group_id=186 > > > > > > > > > On Nov 5, 2019, at 7:01 PM, Matthew Knepley via petsc-users wrote: > > > > > > I have no idea. That is a good question for the PasTix list. > > > > > > Thanks, > > > > > > Matt > > > > > > On Tue, Nov 5, 2019 at 5:32 PM hg wrote: > > > Should thread affinity be invoked? I set -mat_pastix_threadnbr 1 and also OMP_NUM_THREADS to 1 > > > > > > Giang > > > > > > > > > On Tue, Nov 5, 2019 at 10:50 PM Matthew Knepley wrote: > > > On Tue, Nov 5, 2019 at 4:11 PM hg via petsc-users wrote: > > > Hello > > > > > > I got crashed when using Pastix as solver for KSP. The error message looks like: > > > > > > .... > > > NUMBER of BUBBLE 1 > > > COEFMAX 1735566 CPFTMAX 0 BPFTMAX 0 NBFTMAX 0 ARFTMAX 0 > > > ** End of Partition & Distribution phase ** > > > Time to analyze 0.225 s > > > Number of nonzeros in factorized matrix 708784076 > > > Fill-in 12.2337 > > > Number of operations (LU) 2.80185e+12 > > > Prediction Time to factorize (AMD 6180 MKL) 394 s > > > 0 : SolverMatrix size (without coefficients) 32.4 MB > > > 0 : Number of nonzeros (local block structure) 365309391 > > > Numerical Factorization (LU) : > > > 0 : Internal CSC size 1.08 GB > > > Time to fill internal csc 6.66 s > > > --- Sopalin : Allocation de la structure globale --- > > > --- Fin Sopalin Init --- > > > --- Initialisation des tableaux globaux --- > > > sched_setaffinity: Invalid argument > > > [node083:165071] *** Process received signal *** > > > [node083:165071] Signal: Aborted (6) > > > [node083:165071] Signal code: (-6) > > > [node083:165071] [ 0] /lib64/libpthread.so.0(+0xf680)[0x2b8081845680] > > > [node083:165071] [ 1] /lib64/libc.so.6(gsignal+0x37)[0x2b8082191207] > > > [node083:165071] [ 2] /lib64/libc.so.6(abort+0x148)[0x2b80821928f8] > > > [node083:165071] [ 3] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_comm+0x0)[0x2b80a4124c9d] > > > [node083:165071] [ 4] Launching 1 threads (1 commputation, 0 communication, 0 out-of-core) > > > --- Sopalin : Local structure allocation --- > > > /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_sopalin_init_smp+0x29b)[0x2b80a40c39d2] > > > [node083:165071] [ 5] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_smp+0x68)[0x2b80a40cf4c2] > > > [node083:165071] [ 6] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(sopalin_launch_thread+0x4ba)[0x2b80a4124a31] > > > [node083:165071] [ 7] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_ge_sopalin_thread+0x94)[0x2b80a40d6170] > > > [node083:165071] [ 8] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(D_pastix_task_sopalin+0x5ad)[0x2b80a40b09a2] > > > 
[node083:165071] [ 9] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(d_pastix+0xa8a)[0x2b80a40b2325] > > > [node083:165071] [10] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0x63927b)[0x2b80a35bf27b] > > > [node083:165071] [11] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(MatLUFactorNumeric+0x19a)[0x2b80a32c7552] > > > [node083:165071] [12] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(+0xa46c09)[0x2b80a39ccc09] > > > [node083:165071] [13] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(PCSetUp+0x311)[0x2b80a3a8f1a9] > > > [node083:165071] [14] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSetUp+0xbf7)[0x2b80a3b46e81] > > > [node083:165071] [15] /sdhome/bui/opt/petsc-3.11.0_ompi-3.0.0/lib/libpetsc.so.3.11(KSPSolve+0x210)[0x2b80a3b4746e] > > > > > > Does anyone have an idea what is the problem and how to fix it? The PETSc parameters I used are as below: > > > > > > It looks like PasTix is having trouble setting the thread affinity: > > > > > > sched_setaffinity: Invalid argument > > > > > > so it may be your build of PasTix. > > > > > > Thanks, > > > > > > Matt > > > > > > -pc_type lu > > > -pc_factor_mat_solver_package pastix > > > -mat_pastix_verbose 2 > > > -mat_pastix_threadnbr 1 > > > > > > Giang > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > From yann.jobic at univ-amu.fr Tue Nov 19 10:06:46 2019 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 19 Nov 2019 17:06:46 +0100 Subject: [petsc-users] SLEPc GEVP for huge systems Message-ID: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> Hi all, I'm trying to solve a huge generalize (unsymetric) eigen value problem with SLEPc + MUMPS. We actually failed to allocate the requested memory for MUMPS factorization (we tried BVVECS). We would like to know if there is an alternate iterative way of solving such problems. Thank you, Best regards, Yann From jroman at dsic.upv.es Tue Nov 19 10:25:16 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Nov 2019 17:25:16 +0100 Subject: [petsc-users] SLEPc GEVP for huge systems In-Reply-To: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> References: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> Message-ID: <29214E41-C20C-4B97-9469-958C9D859C82@dsic.upv.es> Are you getting an error from MUMPS or from BV? What is the error message you get? What is the size of the matrix? How many eigenvalues do you need to compute? In principle you can use any KSP+PC, see section 3.4.1 of the users manual. If you have a good preconditioner, then an alternative to Krylov methods is to use Davidson-type methods https://doi.org/10.1145/2543696 - in some cases these can be competitive. 
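For reference, here is a minimal sketch (not from this thread) of what the iterative alternative looks like in code: shift-and-invert backed by an inexact KSP solve instead of a full MUMPS factorization. The matrices A and B are assumed to be already assembled, and the GMRES/block-Jacobi/target choices are illustrative placeholders only.

#include <slepceps.h>

static PetscErrorCode SolveGEVP(Mat A,Mat B)
{
  EPS            eps;
  ST             st;
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,B);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps,EPS_GNHEP);CHKERRQ(ierr);           /* generalized non-Hermitian */
  ierr = EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);
  ierr = EPSSetTarget(eps,0.0);CHKERRQ(ierr);                      /* illustrative target */
  ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
  ierr = STSetType(st,STSINVERT);CHKERRQ(ierr);                    /* shift-and-invert spectral transform */
  ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);                   /* iterative inner solves instead of LU */
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCBJACOBI);CHKERRQ(ierr);                    /* replace with a problem-specific PC */
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSDestroy(&eps);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The same choices can be made at run time with options along the lines of -st_type sinvert -st_ksp_type gmres -st_pc_type bjacobi, or -eps_type gd / -eps_type jd to try the Davidson solvers; as noted above, the inexact approaches only pay off when the inner preconditioner is good enough.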
Jose > El 19 nov 2019, a las 17:06, Yann Jobic via petsc-users escribi?: > > Hi all, > I'm trying to solve a huge generalize (unsymetric) eigen value problem with SLEPc + MUMPS. We actually failed to allocate the requested memory for MUMPS factorization (we tried BVVECS). > We would like to know if there is an alternate iterative way of solving such problems. > Thank you, > Best regards, > Yann From mpovolot at purdue.edu Tue Nov 19 13:42:24 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 19:42:24 +0000 Subject: [petsc-users] petsc without MPI Message-ID: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> Hello, I'm trying to build PETSC without MPI. Even if I specify --with_mpi=0, the configuration script still activates MPI. I attach the configure.log. What am I doing wrong? Thank you, Michael. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure.log URL: From balay at mcs.anl.gov Tue Nov 19 13:47:39 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 19 Nov 2019 19:47:39 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> Message-ID: On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > Hello, > > I'm trying to build PETSC without MPI. > > Even if I specify --with_mpi=0, the configuration script still activates > MPI. > > I attach the configure.log. > > What am I doing wrong? The option is --with-mpi=0 Satish > > Thank you, > > Michael. > > From balay at mcs.anl.gov Tue Nov 19 13:51:51 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 19 Nov 2019 19:51:51 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> Message-ID: And I see from configure.log - you are using the correct option. >>>>>>> Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 <<<<<<< And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? Satish On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > > > Hello, > > > > I'm trying to build PETSC without MPI. > > > > Even if I specify --with_mpi=0, the configuration script still activates > > MPI. > > > > I attach the configure.log. > > > > What am I doing wrong? 
> > The option is --with-mpi=0 > > Satish > > > > > > Thank you, > > > > Michael. > > > > > From mpovolot at purdue.edu Tue Nov 19 13:51:42 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 19:51:42 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> Message-ID: <40a99637-47db-bb4b-d6c7-39a831f5ac5d@purdue.edu> On 11/19/2019 2:47 PM, Balay, Satish wrote: > On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> Hello, >> >> I'm trying to build PETSC without MPI. >> >> Even if I specify --with_mpi=0, the configuration script still activates >> MPI. >> >> I attach the configure.log. >> >> What am I doing wrong? > The option is --with-mpi=0 > > Satish > > >> Thank you, >> >> Michael. >> >> Dear Satish, I actually used a correct one --with-mpi=0 (you can see the attached configuration log output, the e-mail had a mistake), but it did not work. Michael. From mpovolot at purdue.edu Tue Nov 19 13:53:38 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 19:53:38 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> Message-ID: <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Why it did not work then? On 11/19/2019 2:51 PM, Balay, Satish wrote: > And I see from configure.log - you are using the correct option. > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 > <<<<<<< > > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? > > Satish > > > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >> >>> Hello, >>> >>> I'm trying to build PETSC without MPI. >>> >>> Even if I specify --with_mpi=0, the configuration script still activates >>> MPI. >>> >>> I attach the configure.log. >>> >>> What am I doing wrong? >> The option is --with-mpi=0 >> >> Satish >> >> >>> Thank you, >>> >>> Michael. 
>>> >>> From knepley at gmail.com Tue Nov 19 13:55:37 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Nov 2019 14:55:37 -0500 Subject: [petsc-users] petsc without MPI In-Reply-To: <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: The log you sent has configure completely successfully. Please retry and send the log for a failed run. Thanks, Matt On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Why it did not work then? > > On 11/19/2019 2:51 PM, Balay, Satish wrote: > > And I see from configure.log - you are using the correct option. > > > > Configure Options: --configModules=PETSc.Configure > --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 > --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 > --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 > --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 > --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 > --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= > FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 > --download-parmetis=0 > --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 > --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 > --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 > -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 > -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " > --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so > --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include > --with-scalapack=0 > > <<<<<<< > > > > And configure completed successfully. What issue are you encountering? > Why do you think its activating MPI? > > > > Satish > > > > > > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > > > >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> > >>> Hello, > >>> > >>> I'm trying to build PETSC without MPI. > >>> > >>> Even if I specify --with_mpi=0, the configuration script still > activates > >>> MPI. > >>> > >>> I attach the configure.log. > >>> > >>> What am I doing wrong? > >> The option is --with-mpi=0 > >> > >> Satish > >> > >> > >>> Thank you, > >>> > >>> Michael. > >>> > >>> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Tue Nov 19 13:58:41 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 19:58:41 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: Let me explain the problem. This log file has #ifndef PETSC_HAVE_MPI #define PETSC_HAVE_MPI 1 #endif while I need to have PETSC without MPI. On 11/19/2019 2:55 PM, Matthew Knepley wrote: The log you sent has configure completely successfully. Please retry and send the log for a failed run. 
Thanks, Matt On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users > wrote: Why it did not work then? On 11/19/2019 2:51 PM, Balay, Satish wrote: > And I see from configure.log - you are using the correct option. > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 > <<<<<<< > > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? > > Satish > > > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >> >>> Hello, >>> >>> I'm trying to build PETSC without MPI. >>> >>> Even if I specify --with_mpi=0, the configuration script still activates >>> MPI. >>> >>> I attach the configure.log. >>> >>> What am I doing wrong? >> The option is --with-mpi=0 >> >> Satish >> >> >>> Thank you, >>> >>> Michael. >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 19 14:00:32 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Nov 2019 15:00:32 -0500 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: On Tue, Nov 19, 2019 at 2:58 PM Povolotskyi, Mykhailo wrote: > Let me explain the problem. > > This log file has > > #ifndef PETSC_HAVE_MPI > #define PETSC_HAVE_MPI 1 > #endif > > while I need to have PETSC without MPI. > If you do not provide MPI, we provide MPIUNI. Do you see it linking to an MPI implementation, or using mpi.h? Matt > On 11/19/2019 2:55 PM, Matthew Knepley wrote: > > The log you sent has configure completely successfully. Please retry and > send the log for a failed run. > > Thanks, > > Matt > > On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Why it did not work then? >> >> On 11/19/2019 2:51 PM, Balay, Satish wrote: >> > And I see from configure.log - you are using the correct option. 
>> > >> > Configure Options: --configModules=PETSc.Configure >> --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 >> --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 >> --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 >> --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 >> --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 >> --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= >> FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 >> --download-parmetis=0 >> --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 >> --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 >> --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 >> -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 >> -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " >> --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so >> --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include >> --with-scalapack=0 >> > <<<<<<< >> > >> > And configure completed successfully. What issue are you encountering? >> Why do you think its activating MPI? >> > >> > Satish >> > >> > >> > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: >> > >> >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >> >> >> >>> Hello, >> >>> >> >>> I'm trying to build PETSC without MPI. >> >>> >> >>> Even if I specify --with_mpi=0, the configuration script still >> activates >> >>> MPI. >> >>> >> >>> I attach the configure.log. >> >>> >> >>> What am I doing wrong? >> >> The option is --with-mpi=0 >> >> >> >> Satish >> >> >> >> >> >>> Thank you, >> >>> >> >>> Michael. >> >>> >> >>> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Nov 19 14:07:16 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 19 Nov 2019 20:07:16 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: Not sure why you are looking at this flag and interpreting it - PETSc code uses the flag PETSC_HAVE_MPIUNI to check for a sequential build. [this one states the module MPI similar to BLASLAPACK etc in configure is enabled] Satish On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > Let me explain the problem. > > This log file has > > #ifndef PETSC_HAVE_MPI > #define PETSC_HAVE_MPI 1 > #endif > > while I need to have PETSC without MPI. > > On 11/19/2019 2:55 PM, Matthew Knepley wrote: > The log you sent has configure completely successfully. Please retry and send the log for a failed run. > > Thanks, > > Matt > > On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users > wrote: > Why it did not work then? 
> > On 11/19/2019 2:51 PM, Balay, Satish wrote: > > And I see from configure.log - you are using the correct option. > > > > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 > > <<<<<<< > > > > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? > > > > Satish > > > > > > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > > > >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> > >>> Hello, > >>> > >>> I'm trying to build PETSC without MPI. > >>> > >>> Even if I specify --with_mpi=0, the configuration script still activates > >>> MPI. > >>> > >>> I attach the configure.log. > >>> > >>> What am I doing wrong? > >> The option is --with-mpi=0 > >> > >> Satish > >> > >> > >>> Thank you, > >>> > >>> Michael. > >>> > >>> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > From mpovolot at purdue.edu Tue Nov 19 14:07:45 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 20:07:45 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: <7b2c1352-df81-16e8-080a-83da950e0ede@purdue.edu> I see. Actually, my goal is to compile petsc without real MPI to use it with libmesh. You are saying that PETSC_HAVE_MPI is not a sign that Petsc is built with MPI. It means you have MPIUNI which is a serial code, but has an interface of MPI. Correct? On 11/19/2019 3:00 PM, Matthew Knepley wrote: On Tue, Nov 19, 2019 at 2:58 PM Povolotskyi, Mykhailo > wrote: Let me explain the problem. This log file has #ifndef PETSC_HAVE_MPI #define PETSC_HAVE_MPI 1 #endif while I need to have PETSC without MPI. If you do not provide MPI, we provide MPIUNI. Do you see it linking to an MPI implementation, or using mpi.h? Matt On 11/19/2019 2:55 PM, Matthew Knepley wrote: The log you sent has configure completely successfully. Please retry and send the log for a failed run. Thanks, Matt On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users > wrote: Why it did not work then? On 11/19/2019 2:51 PM, Balay, Satish wrote: > And I see from configure.log - you are using the correct option. 
> > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 > <<<<<<< > > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? > > Satish > > > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >> >>> Hello, >>> >>> I'm trying to build PETSC without MPI. >>> >>> Even if I specify --with_mpi=0, the configuration script still activates >>> MPI. >>> >>> I attach the configure.log. >>> >>> What am I doing wrong? >> The option is --with-mpi=0 >> >> Satish >> >> >>> Thank you, >>> >>> Michael. >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Tue Nov 19 14:09:26 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Tue, 19 Nov 2019 20:09:26 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> Message-ID: <574993b1-eb17-d56d-5a19-92289131a896@purdue.edu> Thank you. It is clear now. On 11/19/2019 3:07 PM, Balay, Satish wrote: > Not sure why you are looking at this flag and interpreting it - PETSc code uses the flag PETSC_HAVE_MPIUNI to check for a sequential build. > > [this one states the module MPI similar to BLASLAPACK etc in configure is enabled] > > Satish > > On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >> Let me explain the problem. >> >> This log file has >> >> #ifndef PETSC_HAVE_MPI >> #define PETSC_HAVE_MPI 1 >> #endif >> >> while I need to have PETSC without MPI. >> >> On 11/19/2019 2:55 PM, Matthew Knepley wrote: >> The log you sent has configure completely successfully. Please retry and send the log for a failed run. >> >> Thanks, >> >> Matt >> >> On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users > wrote: >> Why it did not work then? 
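As an illustrative sketch of the PETSC_HAVE_MPIUNI flag discussed above (not code from this thread; it assumes a petscconf.h coming from a --with-mpi=0 build), user code can test for a sequential MPIUNI build like this:

    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
    #if defined(PETSC_HAVE_MPIUNI)
      /* sequential build: MPI calls are served by PETSc's MPIUNI wrapper */
      ierr = PetscPrintf(PETSC_COMM_WORLD, "Sequential PETSc build (MPIUNI)\n"); CHKERRQ(ierr);
    #else
      /* PETSc was configured against a real MPI implementation */
      ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSc built with a real MPI\n"); CHKERRQ(ierr);
    #endif
      ierr = PetscFinalize();
      return ierr;
    }

PETSC_HAVE_MPI is defined in both cases, which is why it is PETSC_HAVE_MPIUNI, not PETSC_HAVE_MPI, that distinguishes a serial build.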
>> >> On 11/19/2019 2:51 PM, Balay, Satish wrote: >>> And I see from configure.log - you are using the correct option. >>> >>> Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 >>> <<<<<<< >>> >>> And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? >>> >>> Satish >>> >>> >>> On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: >>> >>>> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >>>> >>>>> Hello, >>>>> >>>>> I'm trying to build PETSC without MPI. >>>>> >>>>> Even if I specify --with_mpi=0, the configuration script still activates >>>>> MPI. >>>>> >>>>> I attach the configure.log. >>>>> >>>>> What am I doing wrong? >>>> The option is --with-mpi=0 >>>> >>>> Satish >>>> >>>> >>>>> Thank you, >>>>> >>>>> Michael. >>>>> >>>>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> From yann.jobic at univ-amu.fr Tue Nov 19 15:05:25 2019 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Tue, 19 Nov 2019 22:05:25 +0100 Subject: [petsc-users] SLEPc GEVP for huge systems In-Reply-To: <29214E41-C20C-4B97-9469-958C9D859C82@dsic.upv.es> References: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> <29214E41-C20C-4B97-9469-958C9D859C82@dsic.upv.es> Message-ID: <365c0629-caae-0f1f-0fc0-4dc61335dc4d@univ-amu.fr> Thanks for the fast answer ! The error coming from MUMPS is : On return from DMUMPS, INFOG(1)= -9 On return from DMUMPS, INFOG(2)= 29088157 The matrix size : 4972410*4972410 I need only 1 eigen value, the one near zero. In order to have more precision, i put ncv at 500. I'm using : -eps_gen_non_hermitian -st_type sinvert -eps_target 0.1 -eps_ncv 500 -eps_tol 1e-9 -bv_type vecs I'm doing linear stability analysis. I'm looking at eigen values near zero, and if the first one is positive or negative. The mass matrix is ill conditioned. On a smaller matrix, it seems that using KSP without a preconditioner gives satisfactory results. With a PC, it diverges. 
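For reference, a sketch (not taken from the thread) of how the option set quoted above -- -eps_gen_non_hermitian -st_type sinvert -eps_target 0.1 -eps_ncv 500 -eps_tol 1e-9 -- could be set up through the SLEPc C API; the matrices A and B are assumed to be assembled elsewhere and error checking is omitted:

    EPS eps;
    ST  st;

    /* Mat A, B assembled or loaded elsewhere */
    EPSCreate(PETSC_COMM_WORLD, &eps);
    EPSSetOperators(eps, A, B);
    EPSSetProblemType(eps, EPS_GNHEP);                /* -eps_gen_non_hermitian */
    EPSGetST(eps, &st);
    STSetType(st, STSINVERT);                         /* -st_type sinvert */
    EPSSetTarget(eps, 0.1);                           /* -eps_target 0.1 */
    EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE); /* eigenvalues nearest the target */
    EPSSetDimensions(eps, 1, 500, PETSC_DEFAULT);     /* nev=1, -eps_ncv 500 */
    EPSSetTolerances(eps, 1e-9, PETSC_DEFAULT);       /* -eps_tol 1e-9 */
    EPSSetFromOptions(eps);                           /* run-time options still apply */
    EPSSolve(eps);

With shift-and-invert each outer iteration solves a linear system with A - sigma*B, which is where the MUMPS factorization (and its INFOG(1)=-9 workspace failure reported above) enters.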
Number of iterations of the method: 1
Number of linear iterations of the method: 1000
Solution method: krylovschur

Number of requested eigenvalues: 1
Stopping condition: tol=1e-08, maxit=711
Linear eigensolve converged (14 eigenpairs) due to CONVERGED_TOL; iterations 1
---------------------- --------------------
          k             ||Ax-kBx||/||kx||
---------------------- --------------------
  0.000005+0.016787i       7.87928e-07
  0.000005-0.016787i       7.87928e-07
  -0.001781                1.11832e-05
  -0.001802                0.00274427
  [...]

I'm trying that on the big one.

Thanks for your help,

Yann

Le 11/19/2019 à 5:25 PM, Jose E. Roman a écrit :
> Are you getting an error from MUMPS or from BV? What is the error message you get? What is the size of the matrix? How many eigenvalues do you need to compute?
>
> In principle you can use any KSP+PC, see section 3.4.1 of the users manual. If you have a good preconditioner, then an alternative to Krylov methods is to use Davidson-type methods https://doi.org/10.1145/2543696 - in some cases these can be competitive.
>
> Jose
>
>
>> El 19 nov 2019, a las 17:06, Yann Jobic via petsc-users escribió:
>>
>> Hi all,
>> I'm trying to solve a huge generalize (unsymetric) eigen value problem with SLEPc + MUMPS. We actually failed to allocate the requested memory for MUMPS factorization (we tried BVVECS).
>> We would like to know if there is an alternate iterative way of solving such problems.
>> Thank you,
>> Best regards,
>> Yann

From jroman at dsic.upv.es Wed Nov 20 08:22:46 2019
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Wed, 20 Nov 2019 15:22:46 +0100
Subject: [petsc-users] SLEPc GEVP for huge systems
In-Reply-To: <365c0629-caae-0f1f-0fc0-4dc61335dc4d@univ-amu.fr>
References: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> <29214E41-C20C-4B97-9469-958C9D859C82@dsic.upv.es> <365c0629-caae-0f1f-0fc0-4dc61335dc4d@univ-amu.fr>
Message-ID: <33A81620-3CF9-4E16-A51A-8AA048948859@dsic.upv.es>

> El 19 nov 2019, a las 22:05, Yann Jobic escribió:
>
> Thanks for the fast answer !
> The error coming from MUMPS is :
> On return from DMUMPS, INFOG(1)= -9
> On return from DMUMPS, INFOG(2)= 29088157
> The matrix size : 4972410*4972410

You may want to try running with -mat_mumps_icntl_14 200

> I need only 1 eigen value, the one near zero.
> In order to have more precision, i put ncv at 500.
> I'm using : -eps_gen_non_hermitian -st_type sinvert -eps_target 0.1 -eps_ncv 500 -eps_tol 1e-9 -bv_type vecs

BVVECS is going to be slower, if the default BV gives a memory error I would suggest using BVMAT.

>
> I'm doing linear stability analysis. I'm looking at eigen values near zero, and if the first one is positive or negative.
> The mass matrix is ill conditioned. On a smaller matrix, it seems that using KSP without a preconditioner gives satisfactory results. With a PC, it diverges.
>
> Number of iterations of the method: 1
> Number of linear iterations of the method: 1000
> Solution method: krylovschur
>
> Number of requested eigenvalues: 1
> Stopping condition: tol=1e-08, maxit=711
> Linear eigensolve converged (14 eigenpairs) due to CONVERGED_TOL; iterations 1
> ---------------------- --------------------
>           k             ||Ax-kBx||/||kx||
> ---------------------- --------------------
>   0.000005+0.016787i       7.87928e-07
>   0.000005-0.016787i       7.87928e-07
>   -0.001781                1.11832e-05
>   -0.001802                0.00274427
> [...]

By default the KSP tolerance is equal to the EPS tolerance. You may need to reduce the KSP tolerance, e.g. -st_ksp_rtol 1e-9

Jose

>
> I'm trying that on the big one.
>
> Thanks for your help,
>
> Yann
>
>
> Le 11/19/2019 ?
5:25 PM, Jose E. Roman a ?crit : >> Are you getting an error from MUMPS or from BV? What is the error message you get? What is the size of the matrix? How many eigenvalues do you need to compute? >> In principle you can use any KSP+PC, see section 3.4.1 of the users manual. If you have a good preconditioner, then an alternative to Krylov methods is to use Davidson-type methods https://doi.org/10.1145/2543696 - in some cases these can be competitive. >> Jose >>> El 19 nov 2019, a las 17:06, Yann Jobic via petsc-users escribi?: >>> >>> Hi all, >>> I'm trying to solve a huge generalize (unsymetric) eigen value problem with SLEPc + MUMPS. We actually failed to allocate the requested memory for MUMPS factorization (we tried BVVECS). >>> We would like to know if there is an alternate iterative way of solving such problems. >>> Thank you, >>> Best regards, >>> Yann From yann.jobic at univ-amu.fr Wed Nov 20 11:35:26 2019 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Wed, 20 Nov 2019 18:35:26 +0100 Subject: [petsc-users] SLEPc GEVP for huge systems In-Reply-To: <33A81620-3CF9-4E16-A51A-8AA048948859@dsic.upv.es> References: <4330c5e5-b40c-0e21-3be5-54a94a8e6a50@univ-amu.fr> <29214E41-C20C-4B97-9469-958C9D859C82@dsic.upv.es> <365c0629-caae-0f1f-0fc0-4dc61335dc4d@univ-amu.fr> <33A81620-3CF9-4E16-A51A-8AA048948859@dsic.upv.es> Message-ID: <344ebf39-11f3-5d52-e177-57812a4cba2c@univ-amu.fr> Hi Jose, My matrices were not correct... It's now running fine, with mumps. Thanks for the help, Best regards, Yann On 20/11/2019 15:22, Jose E. Roman wrote: > > >> El 19 nov 2019, a las 22:05, Yann Jobic escribi?: >> >> Thanks for the fast answer ! >> The error coming from MUMPS is : >> On return from DMUMPS, INFOG(1)= -9 >> On return from DMUMPS, INFOG(2)= 29088157 >> The matrix size : 4972410*4972410 > > You may want to try running with -mat_mumps_icntl_14 200 > >> I need only 1 eigen value, the one near zero. >> In order to have more precision, i put ncv at 500. >> I'm using : -eps_gen_non_hermitian -st_type sinvert -eps_target 0.1 -eps_ncv 500 -eps_tol 1e-9 -bv_type vecs > > BVVECS is going to be slower, if the default BV gives a memory error I would suggest using BVMAT. > >> >> I'm doing linear stability analysis. I'm looking at eigen values near zero, and if the first one is positive or negative. >> The mass matrix is ill conditioned. On a smaller matrix, it seems that using KSP without a preconditioner gives satisfactory results. With a PC, it diverges. >> >> Number of iterations of the method: 1 >> Number of linear iterations of the method: 1000 >> Solution method: krylovschur >> >> Number of requested eigenvalues: 1 >> Stopping condition: tol=1e-08, maxit=711 >> Linear eigensolve converged (14 eigenpairs) due to CONVERGED_TOL; iterations 1 >> ---------------------- -------------------- >> k ||Ax-kBx||/||kx|| >> ---------------------- -------------------- >> 0.000005+0.016787i 7.87928e-07 >> 0.000005-0.016787i 7.87928e-07 >> -0.001781 1.11832e-05 >> -0.001802 0.00274427 >> [...] > > By default the KSP tolerance is equal to the EPS tolerance. You may need to reduce the KSP tolerance, e.g. -st_ksp_rtol 1e-9 > > Jose > >> >> I'm trying that on the big one. >> >> Thanks for your help, >> >> Yann >> >> >> Le 11/19/2019 ? 5:25 PM, Jose E. Roman a ?crit : >>> Are you getting an error from MUMPS or from BV? What is the error message you get? What is the size of the matrix? How many eigenvalues do you need to compute? >>> In principle you can use any KSP+PC, see section 3.4.1 of the users manual. 
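As a sketch of the "any KSP+PC" route mentioned in the quoted advice (illustrative only; eps denotes the already-created EPS object from the earlier sketch), the linear solver used inside the spectral transformation can be selected before EPSSolve():

    ST  st;
    KSP ksp;
    PC  pc;

    EPSGetST(eps, &st);
    STGetKSP(st, &ksp);
    KSPSetType(ksp, KSPGMRES);     /* iterative inner solves instead of a full factorization */
    KSPSetTolerances(ksp, 1e-9, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT); /* cf. -st_ksp_rtol 1e-9 */
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCNONE);         /* "KSP without a preconditioner", as in the message above */

The same choices are available from the command line as -st_ksp_type gmres -st_pc_type none, and if the direct solver is kept, the MUMPS workspace increase suggested above stays a run-time flag, -mat_mumps_icntl_14 200.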
If you have a good preconditioner, then an alternative to Krylov methods is to use Davidson-type methods https://doi.org/10.1145/2543696 - in some cases these can be competitive. >>> Jose >>>> El 19 nov 2019, a las 17:06, Yann Jobic via petsc-users escribi?: >>>> >>>> Hi all, >>>> I'm trying to solve a huge generalize (unsymetric) eigen value problem with SLEPc + MUMPS. We actually failed to allocate the requested memory for MUMPS factorization (we tried BVVECS). >>>> We would like to know if there is an alternate iterative way of solving such problems. >>>> Thank you, >>>> Best regards, >>>> Yann > From perceval.desforges at polytechnique.edu Thu Nov 21 11:13:52 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Thu, 21 Nov 2019 18:13:52 +0100 Subject: [petsc-users] Memory optimization Message-ID: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> Hello all, I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. The options I use are : -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 However the program quickly crashes with this error: slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: [1]PETSC ERROR: Error in external library [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 which is an error due to setting the mumps icntl option so low from what I've gathered. Is there any other way I can reduce memory usage? Thanks, Regards, Perceval, P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Nov 21 11:39:43 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 21 Nov 2019 18:39:43 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> Message-ID: Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html You can comment out the call to EPSSolve() and run with the option -show_inertias For example, the output Shift 0.1 Inertia 3 Shift 0.35 Inertia 11 means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). Jose > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. 
The calculations are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what I've gathered. > > Is there any other way I can reduce memory usage? > > > > Thanks, > > Regards, > > Perceval, > > > > P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. > From perceval.desforges at polytechnique.edu Fri Nov 22 12:56:31 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Fri, 22 Nov 2019 19:56:31 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> Message-ID: <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> Hi, Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed I get this error even when there are no eigenvalues in the interval. I've started using BVMAT instead of BVVECS by the way. Thanks, Perceval, > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). > > By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). > > Jose > >> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: >> >> Hello all, >> >> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. 
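A sketch of the inertia check described above, following the quoted advice and the ex25.c example it links to (not code from this thread; A, B, and the interval endpoints inta, intb are placeholders, and error checking is omitted):

    EPS       eps;
    PetscInt  ns, *inertias, i;
    PetscReal *shifts, inta = 0.0, intb = 1.0;   /* placeholder interval */

    /* Mat A, B loaded elsewhere; spectrum slicing assumes a symmetric(-definite) pencil */
    EPSCreate(PETSC_COMM_WORLD, &eps);
    EPSSetOperators(eps, A, B);
    EPSSetProblemType(eps, EPS_GHEP);
    EPSSetWhichEigenpairs(eps, EPS_ALL);   /* spectrum slicing: all eigenvalues in the interval */
    EPSSetInterval(eps, inta, intb);
    EPSSetFromOptions(eps);                /* e.g. -st_type sinvert -st_pc_type cholesky ... */
    EPSSetUp(eps);                         /* factors at the endpoints; no EPSSolve() needed */
    EPSKrylovSchurGetInertias(eps, &ns, &shifts, &inertias);
    for (i = 0; i < ns; i++)
      PetscPrintf(PETSC_COMM_WORLD, "Shift %g  Inertia %D\n", (double)shifts[i], inertias[i]);
    PetscFree(shifts);
    PetscFree(inertias);

The difference between consecutive inertias is the number of eigenvalues between those shifts, so an interval containing far too many eigenvalues shows up here before any eigenvector storage is allocated.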
>> >> The options I use are : >> >> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 >> >> However the program quickly crashes with this error: >> >> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed >> >> I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: >> >> [1]PETSC ERROR: Error in external library >> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 >> >> which is an error due to setting the mumps icntl option so low from what I've gathered. >> >> Is there any other way I can reduce memory usage? >> >> Thanks, >> >> Regards, >> >> Perceval, >> >> P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Nov 22 15:51:37 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 22 Nov 2019 21:51:37 +0000 Subject: [petsc-users] petsc-3.12.2.tar.gz now available Message-ID: Dear PETSc users, The patch release petsc-3.12.2 is now available for download, with change list at 'PETSc-3.12 Changelog' http://www.mcs.anl.gov/petsc/download/index.html Satish From ztdepyahoo at gmail.com Sun Nov 24 17:22:08 2019 From: ztdepyahoo at gmail.com (Peng Ding) Date: Mon, 25 Nov 2019 07:22:08 +0800 Subject: [petsc-users] Fwd: how to set the matrix with the new cell ordering with metis In-Reply-To: <301A0E4B-E95D-4746-9CFF-A64084D27CA5@gmail.com> References: <301A0E4B-E95D-4746-9CFF-A64084D27CA5@gmail.com> Message-ID: Dear sir: I generate the 3D mesh with Tetgen for FVM computation. I reordered the cells and partitioned them with metis. Then i got an array which records the distribution of each cells. For example, on CPU -0., I have: 1, 3, 6 , 7 ....11, 22...... But all these cells have non-continuos index, how to set the matrix in petsc. Regards ztdepyahoo ztdepyahoo at gmail.com ??? ?????? ?? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Nov 24 17:29:49 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 24 Nov 2019 17:29:49 -0600 Subject: [petsc-users] Fwd: how to set the matrix with the new cell ordering with metis In-Reply-To: References: <301A0E4B-E95D-4746-9CFF-A64084D27CA5@gmail.com> Message-ID: On Sun, Nov 24, 2019 at 5:23 PM Peng Ding wrote: > > Dear sir: > I generate the 3D mesh with Tetgen for FVM computation. I reordered > the cells and partitioned them with metis. Then i got an array which > records the distribution of each cells. For example, on CPU -0., I have: > 1, 3, 6 , 7 ....11, 22...... > But all these cells have non-continuos index, how to set the matrix in > petsc. > You will have to renumber them. Thanks, Matt > Regards > > > ztdepyahoo > ztdepyahoo at gmail.com > > > ??? ?????? ?? > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Nov 24 21:40:04 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Mon, 25 Nov 2019 03:40:04 +0000 Subject: [petsc-users] how to set the matrix with the new cell ordering with metis In-Reply-To: References: <301A0E4B-E95D-4746-9CFF-A64084D27CA5@gmail.com> Message-ID: <8F373FC9-58BC-4AA1-86FA-BA440469C537@anl.gov> You can possibly use the PETSc object AO (see AOCreate()) to manage the reordering. The non-contiguous order you start with is the application ordering and the new contiguous ordering is the petsc ordering. Note you will likely need to reorder the cell vertex or edge numbers as well. Barry > On Nov 24, 2019, at 5:29 PM, Matthew Knepley wrote: > > On Sun, Nov 24, 2019 at 5:23 PM Peng Ding wrote: > > Dear sir: > I generate the 3D mesh with Tetgen for FVM computation. I reordered the cells and partitioned them with metis. Then i got an array which records the distribution of each cells. For example, on CPU -0., I have: > 1, 3, 6 , 7 ....11, 22...... > But all these cells have non-continuos index, how to set the matrix in petsc. > > You will have to renumber them. > > Thanks, > > Matt > > Regards > > > > ztdepyahoo > ztdepyahoo at gmail.com > ??? ?????? ?? > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From jroman at dsic.upv.es Mon Nov 25 02:44:12 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 25 Nov 2019 09:44:12 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> Message-ID: <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with -mat_view ::ascii_info Jose > El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: > > Hi, > > Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > > > > >> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. >> >> Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: >> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html >> You can comment out the call to EPSSolve() and run with the option -show_inertias >> For example, the output >> Shift 0.1 Inertia 3 >> Shift 0.35 Inertia 11 >> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). >> >> By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). >> >> Jose >> >> >>> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: >>> >>> Hello all, >>> >>> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. 
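Returning to the cell-renumbering question above, a minimal sketch of the AO approach Barry suggests (illustrative only; the index values are placeholders echoing the "1, 3, 6, ... 22" example in the question):

    AO       ao;
    PetscInt nlocal = 3;
    PetscInt app[3] = {1, 3, 6};    /* non-contiguous application (METIS) cell numbers owned by this rank */
    PetscInt idx[2] = {3, 22};      /* e.g. a cell and a neighbour, in application numbering */

    /* NULL: the PETSc ordering is the natural contiguous one, 0,1,2,... across ranks */
    AOCreateBasic(PETSC_COMM_WORLD, nlocal, app, NULL, &ao);
    AOApplicationToPetsc(ao, 2, idx);   /* idx[] now holds contiguous PETSc indices for MatSetValues() */
    AODestroy(&ao);

The AO holds the complete mapping, so indices owned by other ranks (such as 22 here) are translated as well.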
The calculations are run on a processor with 20 cores and 96 Go of RAM. >>> >>> The options I use are : >>> >>> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 >>> >>> >>> >>> However the program quickly crashes with this error: >>> >>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed >>> >>> I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: >>> >>> [1]PETSC ERROR: Error in external library >>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 >>> >>> which is an error due to setting the mumps icntl option so low from what I've gathered. >>> >>> Is there any other way I can reduce memory usage? >>> >>> >>> >>> Thanks, >>> >>> Regards, >>> >>> Perceval, >>> >>> >>> >>> P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. > > From perceval.desforges at polytechnique.edu Mon Nov 25 11:20:24 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Mon, 25 Nov 2019 18:20:24 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> Message-ID: <792624a5b70858444b7e529ce3624395@polytechnique.edu> Hi, So I'm loading two matrices from files, both 1000000 by 10000000. I ran the program with -mat_view::ascii_info and I got: Mat Object: 1 MPI processes type: seqaij rows=1000000, cols=1000000 total: nonzeros=7000000, allocated nonzeros=7000000 total number of mallocs used during MatSetValues calls =0 not using I-node routines 20 times, and then Mat Object: 1 MPI processes type: seqaij rows=1000000, cols=1000000 total: nonzeros=1000000, allocated nonzeros=1000000 total number of mallocs used during MatSetValues calls =0 not using I-node routines 20 times as well, and then Mat Object: 1 MPI processes type: seqaij rows=1000000, cols=1000000 total: nonzeros=7000000, allocated nonzeros=7000000 total number of mallocs used during MatSetValues calls =0 not using I-node routines 20 times as well before crashing. I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? Thanks again, Best regards, Perceval, > Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with > -mat_view ::ascii_info > > Jose > > El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: > > Hi, > > Thanks for your answer. 
I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). > > By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). > > Jose > > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what I've gathered. > > Is there any other way I can reduce memory usage? > > Thanks, > > Regards, > > Perceval, > > P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Nov 25 11:25:07 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Nov 2019 11:25:07 -0600 Subject: [petsc-users] Memory optimization In-Reply-To: <792624a5b70858444b7e529ce3624395@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> Message-ID: On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges < perceval.desforges at polytechnique.edu> wrote: > Hi, > > So I'm loading two matrices from files, both 1000000 by 10000000. 
I ran > the program with -mat_view::ascii_info and I got: > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=1000000, allocated nonzeros=1000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well before crashing. > > I realized it might be because I am setting up 20 krylov schur partitions > which may be too much. I tried running the code again with only 2 > partitions and now the code runs but I have speed issues. > > I have one version of the code where my first matrix has 5 non-zero > diagonals (so 5000000 non-zero entries), and the set up time is quite fast > (8 seconds) and solving is also quite fast. The second version is the same > but I have two extra non-zero diagonals (7000000 non-zero entries) and the > set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also > a lot slower. Is it normal that adding two extra diagonals increases solve > and set up time so much? > > I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the preallocation correctly. Thanks, Matt > Thanks again, > > Best regards, > > Perceval, > > > > Then I guess it is the factorization that is failing. How many nonzero > entries do you have? Run with > -mat_view ::ascii_info > > Jose > > > El 22 nov 2019, a las 19:56, Perceval Desforges < > perceval.desforges at polytechnique.edu> escribi?: > > Hi, > > Thanks for your answer. I tried looking at the inertias before solving, > but the problem is that the program crashes when I call EPSSetUp with this > error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > > > > > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and > contains too many eigenvalues (SLEPc needs to allocate at least one vector > per each eigenvalue). You can count the eigenvalues in the interval with > the inertias, which are available at EPSSetUp (no need to call EPSSolve). > See this example: > > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option > -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). > > By the way, I would suggest using BVMAT instead of BVVECS (the latter is > slower). 
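A sketch of the preallocation point made above (hypothetical, since the assembly code is not shown in the thread): for the 7-diagonal operator every row needs room for 7 entries, e.g.

    Mat      A;
    PetscInt N = 1000000;

    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);
    MatSetFromOptions(A);
    MatSeqAIJSetPreallocation(A, 7, NULL);          /* 7 nonzeros per row, not 5 */
    MatMPIAIJSetPreallocation(A, 7, NULL, 3, NULL); /* rough off-diagonal guess for a parallel run */
    /* ... MatSetValues() loop ... */
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

Underestimating the per-row count forces a malloc for every offending row during MatSetValues(); the "mallocs used during MatSetValues calls" line of -mat_view ::ascii_info (or running with -info) shows whether that is happening.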
> > Jose > > > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a > fairly large matrix (1000000 * 1000000). I therefore use the spectrum > slicing method detailed in section 3.4.5 of the manual. The calculations > are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 > -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the > -mat_mumps_icntl_14 option by setting it at -70 for example but then I get > this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: > INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what > I've gathered. > > Is there any other way I can reduce memory usage? > > > > Thanks, > > Regards, > > Perceval, > > > > P.S. I sent the same email a few minutes ago but I think I made a mistake > in the address, I'm sorry if I've sent it twice. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Nov 25 11:31:19 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 25 Nov 2019 18:31:19 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> Message-ID: Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0". Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones. The partitions option is intended for running several nodes. If you are using just one node probably it is better to set one partition only. Jose > El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: > > On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges wrote: > Hi, > > So I'm loading two matrices from files, both 1000000 by 10000000. I ran the program with -mat_view::ascii_info and I got: > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=1000000, allocated nonzeros=1000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well before crashing. 
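A short sketch of the partition setting mentioned above (eps as in the earlier sketches; the command-line equivalent is -eps_krylovschur_partitions 1):

    /* spectrum slicing on a single node: keep a single partition of the communicator */
    EPSKrylovSchurSetPartitions(eps, 1);

Roughly speaking, each partition works on its own sub-interval with its own factorizations, which is consistent with the memory growth seen when going from 2 to 20 partitions in this thread.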
> > I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. > > I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? > > > I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create > your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the > preallocation correctly. > > Thanks, > > Matt > Thanks again, > > Best regards, > > Perceval, > > > > > >> Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with >> -mat_view ::ascii_info >> >> Jose >> >> >>> El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: >>> >>> Hi, >>> >>> Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: >>> >>> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed >>> >>> I get this error even when there are no eigenvalues in the interval. >>> >>> I've started using BVMAT instead of BVVECS by the way. >>> >>> Thanks, >>> >>> Perceval, >>> >>> >>> >>> >>> >>>> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. >>>> >>>> Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: >>>> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html >>>> You can comment out the call to EPSSolve() and run with the option -show_inertias >>>> For example, the output >>>> Shift 0.1 Inertia 3 >>>> Shift 0.35 Inertia 11 >>>> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). >>>> >>>> By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). >>>> >>>> Jose >>>> >>>> >>>>> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: >>>>> >>>>> Hello all, >>>>> >>>>> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. 
>>>>> >>>>> The options I use are : >>>>> >>>>> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 >>>>> >>>>> >>>>> >>>>> However the program quickly crashes with this error: >>>>> >>>>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed >>>>> >>>>> I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: >>>>> >>>>> [1]PETSC ERROR: Error in external library >>>>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 >>>>> >>>>> which is an error due to setting the mumps icntl option so low from what I've gathered. >>>>> >>>>> Is there any other way I can reduce memory usage? >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Regards, >>>>> >>>>> Perceval, >>>>> >>>>> >>>>> >>>>> P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. >>> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From perceval.desforges at polytechnique.edu Mon Nov 25 11:44:50 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Mon, 25 Nov 2019 18:44:50 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> Message-ID: <0007da7378494d3bdb15c219872d3359@polytechnique.edu> I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite farm apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the set up time is around 400 times greater in the 3D case? Is there nothing to be done? I will try setting up only one partition. Thanks, Perceval, > Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0". > > Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones. > > The partitions option is intended for running several nodes. If you are using just one node probably it is better to set one partition only. > > Jose > > El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: > > On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges wrote: > Hi, > > So I'm loading two matrices from files, both 1000000 by 10000000. 
I ran the program with -mat_view::ascii_info and I got: > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=1000000, allocated nonzeros=1000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well before crashing. > > I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. > > I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? > > I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create > your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the > preallocation correctly. > > Thanks, > > Matt > Thanks again, > > Best regards, > > Perceval, > > Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with > -mat_view ::ascii_info > > Jose > > El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: > > Hi, > > Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). > > By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). > > Jose > > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. 
The calculations are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what I've gathered. > > Is there any other way I can reduce memory usage? > > Thanks, > > Regards, > > Perceval, > > P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Nov 25 11:49:16 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 25 Nov 2019 18:49:16 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: <0007da7378494d3bdb15c219872d3359@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> <0007da7378494d3bdb15c219872d3359@polytechnique.edu> Message-ID: In 3D problems it is recommended to use preconditioned iterative solvers. Unfortunately the spectrum slicing technique requires the full factorization (because it uses matrix inertia). > El 25 nov 2019, a las 18:44, Perceval Desforges escribi?: > > I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite farm apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the set up time is around 400 times greater in the 3D case? Is there nothing to be done? > > I will try setting up only one partition. > > Thanks, > > Perceval, > >> Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0". >> >> Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones. >> >> The partitions option is intended for running several nodes. If you are using just one node probably it is better to set one partition only. >> >> Jose >> >> >>> El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: >>> >>> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges wrote: >>> Hi, >>> >>> So I'm loading two matrices from files, both 1000000 by 10000000. 
I ran the program with -mat_view::ascii_info and I got: >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=1000000, cols=1000000 >>> total: nonzeros=7000000, allocated nonzeros=7000000 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> >>> 20 times, and then >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=1000000, cols=1000000 >>> total: nonzeros=1000000, allocated nonzeros=1000000 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> >>> 20 times as well, and then >>> >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=1000000, cols=1000000 >>> total: nonzeros=7000000, allocated nonzeros=7000000 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> >>> 20 times as well before crashing. >>> >>> I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. >>> >>> I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? >>> >>> >>> I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create >>> your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the >>> preallocation correctly. >>> >>> Thanks, >>> >>> Matt >>> Thanks again, >>> >>> Best regards, >>> >>> Perceval, >>> >>> >>> >>> >>> >>>> Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with >>>> -mat_view ::ascii_info >>>> >>>> Jose >>>> >>>> >>>>> El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: >>>>> >>>>> Hi, >>>>> >>>>> Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: >>>>> >>>>> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed >>>>> >>>>> I get this error even when there are no eigenvalues in the interval. >>>>> >>>>> I've started using BVMAT instead of BVVECS by the way. >>>>> >>>>> Thanks, >>>>> >>>>> Perceval, >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. >>>>>> >>>>>> Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: >>>>>> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html >>>>>> You can comment out the call to EPSSolve() and run with the option -show_inertias >>>>>> For example, the output >>>>>> Shift 0.1 Inertia 3 >>>>>> Shift 0.35 Inertia 11 >>>>>> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). >>>>>> >>>>>> By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). 
>>>>>> >>>>>> Jose >>>>>> >>>>>> >>>>>>> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: >>>>>>> >>>>>>> Hello all, >>>>>>> >>>>>>> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. >>>>>>> >>>>>>> The options I use are : >>>>>>> >>>>>>> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 >>>>>>> >>>>>>> >>>>>>> >>>>>>> However the program quickly crashes with this error: >>>>>>> >>>>>>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed >>>>>>> >>>>>>> I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: >>>>>>> >>>>>>> [1]PETSC ERROR: Error in external library >>>>>>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 >>>>>>> >>>>>>> which is an error due to setting the mumps icntl option so low from what I've gathered. >>>>>>> >>>>>>> Is there any other way I can reduce memory usage? >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> Perceval, >>>>>>> >>>>>>> >>>>>>> >>>>>>> P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ > > From knepley at gmail.com Mon Nov 25 11:48:52 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Nov 2019 11:48:52 -0600 Subject: [petsc-users] Memory optimization In-Reply-To: <0007da7378494d3bdb15c219872d3359@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> <0007da7378494d3bdb15c219872d3359@polytechnique.edu> Message-ID: On Mon, Nov 25, 2019 at 11:45 AM Perceval Desforges < perceval.desforges at polytechnique.edu> wrote: > I am basically trying to solve a finite element problem, which is why in > 3D I have 7 non-zero diagonals that are quite farm apart from one another. > In 2D I only have 5 non-zero diagonals that are less far apart. So is it > normal that the set up time is around 400 times greater in the 3D case? Is > there nothing to be done? > > No. It is almost certain that preallocation is screwed up. There is no way it can take 400x longer for a few nonzeros. In order to debug, please send the output of -log_view and indicate where the time is taken for assembly. You can usually track down bad preallocation using -info. Thanks, Matt > I will try setting up only one partition. > > Thanks, > > Perceval, > > Probably it is not a preallocation issue, as it shows "total number of > mallocs used during MatSetValues calls =0". > > Adding new diagonals may increase fill-in a lot, if the new diagonals are > displaced with respect to the other ones. > > The partitions option is intended for running several nodes. If you are > using just one node probably it is better to set one partition only. 
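As a concrete, hedged illustration of the preallocation point Matt makes above, here is a sketch of assembling a 7-diagonal AIJ matrix of roughly the size discussed in the thread, with the per-row preallocation set to 7 rather than 5. The stencil offsets (+/-1, +/-m, +/-m*m), the values, and the neglect of grid-boundary wrap-around are assumptions for illustration only, not the poster's actual discretization.

#include <petscmat.h>

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       m=100,N,row,Istart,Iend,ncols,cols[7];
  PetscScalar    vals[7];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  N = m*m*m;                                     /* ~10^6 unknowns, as in the thread */

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,N,N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  /* 7-point stencil: preallocate 7 nonzeros per row, not 5 */
  ierr = MatSeqAIJSetPreallocation(A,7,NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A,7,NULL,7,NULL);CHKERRQ(ierr);

  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (row=Istart;row<Iend;row++) {
    ncols = 0;
    cols[ncols] = row;                 vals[ncols++] =  6.0;
    if (row-1   >= 0) { cols[ncols] = row-1;   vals[ncols++] = -1.0; }
    if (row+1   <  N) { cols[ncols] = row+1;   vals[ncols++] = -1.0; }
    if (row-m   >= 0) { cols[ncols] = row-m;   vals[ncols++] = -1.0; }
    if (row+m   <  N) { cols[ncols] = row+m;   vals[ncols++] = -1.0; }
    if (row-m*m >= 0) { cols[ncols] = row-m*m; vals[ncols++] = -1.0; }
    if (row+m*m <  N) { cols[ncols] = row+m*m; vals[ncols++] = -1.0; }
    ierr = MatSetValues(A,1,&row,ncols,cols,vals,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* run with -info and grep for "malloc": it should report 0 mallocs during MatSetValues() */

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}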
> > Jose > > > El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: > > On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges < > perceval.desforges at polytechnique.edu> wrote: > Hi, > > So I'm loading two matrices from files, both 1000000 by 10000000. I ran > the program with -mat_view::ascii_info and I got: > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=1000000, allocated nonzeros=1000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well before crashing. > > I realized it might be because I am setting up 20 krylov schur partitions > which may be too much. I tried running the code again with only 2 > partitions and now the code runs but I have speed issues. > > I have one version of the code where my first matrix has 5 non-zero > diagonals (so 5000000 non-zero entries), and the set up time is quite fast > (8 seconds) and solving is also quite fast. The second version is the same > but I have two extra non-zero diagonals (7000000 non-zero entries) and the > set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also > a lot slower. Is it normal that adding two extra diagonals increases solve > and set up time so much? > > > I can't see the rest of your code, but I am guessing your preallocation > statement has "5", so it does no mallocs when you create > your first matrix, but mallocs for every row when you create your second > matrix. When you load them from disk, we do all the > preallocation correctly. > > Thanks, > > Matt > Thanks again, > > Best regards, > > Perceval, > > > > > > Then I guess it is the factorization that is failing. How many nonzero > entries do you have? Run with > -mat_view ::ascii_info > > Jose > > > El 22 nov 2019, a las 19:56, Perceval Desforges < > perceval.desforges at polytechnique.edu> escribi?: > > Hi, > > Thanks for your answer. I tried looking at the inertias before solving, > but the problem is that the program crashes when I call EPSSetUp with this > error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > > > > > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and > contains too many eigenvalues (SLEPc needs to allocate at least one vector > per each eigenvalue). You can count the eigenvalues in the interval with > the inertias, which are available at EPSSetUp (no need to call EPSSolve). > See this example: > > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option > -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). 
> > By the way, I would suggest using BVMAT instead of BVVECS (the latter is > slower). > > Jose > > > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a > fairly large matrix (1000000 * 1000000). I therefore use the spectrum > slicing method detailed in section 3.4.5 of the manual. The calculations > are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 > -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the > -mat_mumps_icntl_14 option by setting it at -70 for example but then I get > this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: > INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what > I've gathered. > > Is there any other way I can reduce memory usage? > > > > Thanks, > > Regards, > > Perceval, > > > > P.S. I sent the same email a few minutes ago but I think I made a mistake > in the address, I'm sorry if I've sent it twice. > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Mon Nov 25 18:23:59 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Mon, 25 Nov 2019 16:23:59 -0800 Subject: [petsc-users] Domain decomposition using DMPLEX Message-ID: Dear PETSc users and developers, I am working with dmplex to distribute a 3D unstructured mesh made of tetrahedrons in a cuboidal domain. I had a few queries: 1) Is there any way of ensuring load balancing based on the number of vertices per MPI process. 2) As the global domain is cuboidal, is the resulting domain decomposition also cuboidal on every MPI process? If not, is there a way to ensure this? For example in DMDA, the default domain decomposition for a cuboidal domain is cuboidal. Sincerely, SG -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Nov 25 21:54:44 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Nov 2019 21:54:44 -0600 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh wrote: > Dear PETSc users and developers, > > I am working with dmplex to distribute a 3D unstructured mesh made of > tetrahedrons in a cuboidal domain. I had a few queries: > 1) Is there any way of ensuring load balancing based on the number of > vertices per MPI process. > You can now call DMPlexRebalanceSharedPoints() to try and get better balance of vertices. > 2) As the global domain is cuboidal, is the resulting domain decomposition > also cuboidal on every MPI process? 
If not, is there a way to ensure this? > For example in DMDA, the default domain decomposition for a cuboidal domain > is cuboidal. > It sounds like you do not want something that is actually unstructured. Rather, it seems like you want to take a DMDA type thing and split it into tets. You can get a cuboidal decomposition of a hex mesh easily. Call DMPlexCreateBoxMesh() with one cell for every process, distribute, and then uniformly refine. This will not quite work for tets since the mesh partitioner will tend to violate that constraint. You could: a) Prescribe the distribution yourself using the Shell partitioner type or b) Write a refiner that turns hexes into tets We already have a refiner that turns tets into hexes, but we never wrote the other direction because it was not clear that it was useful. Thanks, Matt > Sincerely, > SG > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Mon Nov 25 22:45:18 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Mon, 25 Nov 2019 20:45:18 -0800 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: Hi Matt, https://arxiv.org/pdf/1907.02604.pdf On Mon, Nov 25, 2019 at 7:54 PM Matthew Knepley wrote: > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh > wrote: > >> Dear PETSc users and developers, >> >> I am working with dmplex to distribute a 3D unstructured mesh made of >> tetrahedrons in a cuboidal domain. I had a few queries: >> 1) Is there any way of ensuring load balancing based on the number of >> vertices per MPI process. >> > > You can now call DMPlexRebalanceSharedPoints() to try and get better > balance of vertices. > > Thank you for pointing out this function! > 2) As the global domain is cuboidal, is the resulting domain decomposition >> also cuboidal on every MPI process? If not, is there a way to ensure this? >> For example in DMDA, the default domain decomposition for a cuboidal domain >> is cuboidal. >> > > It sounds like you do not want something that is actually unstructured. > Rather, it seems like you want to > take a DMDA type thing and split it into tets. You can get a cuboidal > decomposition of a hex mesh easily. > Call DMPlexCreateBoxMesh() with one cell for every process, distribute, > and then uniformly refine. This > will not quite work for tets since the mesh partitioner will tend to > violate that constraint. You could: > > No, I have an unstructured mesh that increases in resolution away from the center of the cuboid. See Figure: 5 in the ArXiv paper https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane of the cuboid. Given this type of mesh, will dmplex do a cuboidal domain decomposition? Sincerely, SG > a) Prescribe the distribution yourself using the Shell partitioner type > > or > > b) Write a refiner that turns hexes into tets > > We already have a refiner that turns tets into hexes, but we never wrote > the other direction because it was not clear > that it was useful. > > Thanks, > > Matt > > >> Sincerely, >> SG >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Nov 25 23:02:50 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 26 Nov 2019 05:02:50 +0000 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: "No, I have an unstructured mesh that increases in resolution away from the center of the cuboid. See Figure: 5 in the ArXiv paper https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane of the cuboid. Given this type of mesh, will dmplex do a cuboidal domain decomposition?" No definitely not. Why do you need a cuboidal domain decomposition? Barry > On Nov 25, 2019, at 10:45 PM, Swarnava Ghosh wrote: > > Hi Matt, > > > https://arxiv.org/pdf/1907.02604.pdf > > On Mon, Nov 25, 2019 at 7:54 PM Matthew Knepley wrote: > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh wrote: > Dear PETSc users and developers, > > I am working with dmplex to distribute a 3D unstructured mesh made of tetrahedrons in a cuboidal domain. I had a few queries: > 1) Is there any way of ensuring load balancing based on the number of vertices per MPI process. > > You can now call DMPlexRebalanceSharedPoints() to try and get better balance of vertices. > > Thank you for pointing out this function! > > 2) As the global domain is cuboidal, is the resulting domain decomposition also cuboidal on every MPI process? If not, is there a way to ensure this? For example in DMDA, the default domain decomposition for a cuboidal domain is cuboidal. > > It sounds like you do not want something that is actually unstructured. Rather, it seems like you want to > take a DMDA type thing and split it into tets. You can get a cuboidal decomposition of a hex mesh easily. > Call DMPlexCreateBoxMesh() with one cell for every process, distribute, and then uniformly refine. This > will not quite work for tets since the mesh partitioner will tend to violate that constraint. You could: > > No, I have an unstructured mesh that increases in resolution away from the center of the cuboid. See Figure: 5 in the ArXiv paper https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane of the cuboid. Given this type of mesh, will dmplex do a cuboidal domain decomposition? > > Sincerely, > SG > > a) Prescribe the distribution yourself using the Shell partitioner type > > or > > b) Write a refiner that turns hexes into tets > > We already have a refiner that turns tets into hexes, but we never wrote the other direction because it was not clear > that it was useful. > > Thanks, > > Matt > > Sincerely, > SG > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From bsmith at mcs.anl.gov Tue Nov 26 00:27:45 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 26 Nov 2019 06:27:45 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: <7b2c1352-df81-16e8-080a-83da950e0ede@purdue.edu> References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> <7b2c1352-df81-16e8-080a-83da950e0ede@purdue.edu> Message-ID: <528CB72A-B209-42A5-99EE-9ED9D3EADA19@anl.gov> I agree this is confusing. https://gitlab.com/petsc/petsc/merge_requests/2331 the flag PETSC_HAVE_MPI will no longer be set when MPI is not used (only MPIUNI is used). 
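A small sketch of what this looks like from user code: in a --with-mpi=0 build the MPI calls below hit the sequential MPIUNI stubs and the program runs on one process. PETSC_HAVE_MPIUNI is, as far as I know, the macro petscconf.h defines for such builds; treat it as an assumption and confirm against your own petscconf.h.

#include <petscsys.h>

int main(int argc,char **argv)
{
  PetscMPIInt    size,rank;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* In a --with-mpi=0 build these are the sequential MPIUNI stubs */
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
#if defined(PETSC_HAVE_MPIUNI)
  ierr = PetscPrintf(PETSC_COMM_WORLD,"MPIUNI build: size=%d rank=%d (always 1 and 0)\n",size,rank);CHKERRQ(ierr);
#else
  ierr = PetscPrintf(PETSC_COMM_WORLD,"Real MPI build: size=%d rank=%d\n",size,rank);CHKERRQ(ierr);
#endif
  ierr = PetscFinalize();
  return ierr;
}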
Barry The code API still has MPI* in it with MPI but they are stubs that just handle the sequential code and do not require an installation of MPI. > On Nov 19, 2019, at 2:07 PM, Povolotskyi, Mykhailo via petsc-users wrote: > > I see. > > Actually, my goal is to compile petsc without real MPI to use it with libmesh. > > You are saying that PETSC_HAVE_MPI is not a sign that Petsc is built with MPI. It means you have MPIUNI which is a serial code, but has an interface of MPI. > > Correct? > > On 11/19/2019 3:00 PM, Matthew Knepley wrote: >> On Tue, Nov 19, 2019 at 2:58 PM Povolotskyi, Mykhailo wrote: >> Let me explain the problem. >> >> This log file has >> >> #ifndef PETSC_HAVE_MPI >> #define PETSC_HAVE_MPI 1 >> #endif >> >> while I need to have PETSC without MPI. >> >> If you do not provide MPI, we provide MPIUNI. Do you see it linking to an MPI implementation, or using mpi.h? >> >> Matt >> >> On 11/19/2019 2:55 PM, Matthew Knepley wrote: >>> The log you sent has configure completely successfully. Please retry and send the log for a failed run. >>> >>> Thanks, >>> >>> Matt >>> >>> On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users wrote: >>> Why it did not work then? >>> >>> On 11/19/2019 2:51 PM, Balay, Satish wrote: >>> > And I see from configure.log - you are using the correct option. >>> > >>> > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 >>> > <<<<<<< >>> > >>> > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? >>> > >>> > Satish >>> > >>> > >>> > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: >>> > >>> >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: >>> >> >>> >>> Hello, >>> >>> >>> >>> I'm trying to build PETSC without MPI. >>> >>> >>> >>> Even if I specify --with_mpi=0, the configuration script still activates >>> >>> MPI. >>> >>> >>> >>> I attach the configure.log. >>> >>> >>> >>> What am I doing wrong? >>> >> The option is --with-mpi=0 >>> >> >>> >> Satish >>> >> >>> >> >>> >>> Thank you, >>> >>> >>> >>> Michael. >>> >>> >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ From balay at mcs.anl.gov Tue Nov 26 07:52:02 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 26 Nov 2019 13:52:02 +0000 Subject: [petsc-users] petsc without MPI In-Reply-To: <528CB72A-B209-42A5-99EE-9ED9D3EADA19@anl.gov> References: <01a221b4-795e-6624-71e4-75853a1bbf6e@purdue.edu> <6925fced-aaf2-1ef9-866c-a0c415d24df6@purdue.edu> <7b2c1352-df81-16e8-080a-83da950e0ede@purdue.edu> <528CB72A-B209-42A5-99EE-9ED9D3EADA19@anl.gov> Message-ID: Generally - even when one wants sequential build - its best use MPICH [or openMPI] when using multiple MPI based packages. [this is to avoid conflicts - if any - in the seqential MPI stubs of these packages] And run the code sequentially.. Satish On Tue, 26 Nov 2019, Smith, Barry F. wrote: > > I agree this is confusing. https://gitlab.com/petsc/petsc/merge_requests/2331 the flag PETSC_HAVE_MPI will no longer be set when MPI is not used (only MPIUNI is used). > > Barry > > The code API still has MPI* in it with MPI but they are stubs that just handle the sequential code and do not require an installation of MPI. > > > > On Nov 19, 2019, at 2:07 PM, Povolotskyi, Mykhailo via petsc-users wrote: > > > > I see. > > > > Actually, my goal is to compile petsc without real MPI to use it with libmesh. > > > > You are saying that PETSC_HAVE_MPI is not a sign that Petsc is built with MPI. It means you have MPIUNI which is a serial code, but has an interface of MPI. > > > > Correct? > > > > On 11/19/2019 3:00 PM, Matthew Knepley wrote: > >> On Tue, Nov 19, 2019 at 2:58 PM Povolotskyi, Mykhailo wrote: > >> Let me explain the problem. > >> > >> This log file has > >> > >> #ifndef PETSC_HAVE_MPI > >> #define PETSC_HAVE_MPI 1 > >> #endif > >> > >> while I need to have PETSC without MPI. > >> > >> If you do not provide MPI, we provide MPIUNI. Do you see it linking to an MPI implementation, or using mpi.h? > >> > >> Matt > >> > >> On 11/19/2019 2:55 PM, Matthew Knepley wrote: > >>> The log you sent has configure completely successfully. Please retry and send the log for a failed run. > >>> > >>> Thanks, > >>> > >>> Matt > >>> > >>> On Tue, Nov 19, 2019 at 2:53 PM Povolotskyi, Mykhailo via petsc-users wrote: > >>> Why it did not work then? > >>> > >>> On 11/19/2019 2:51 PM, Balay, Satish wrote: > >>> > And I see from configure.log - you are using the correct option. 
> >>> > > >>> > Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-scalar-type=real --with-x=0 --with-hdf5=0 --with-single-library=1 --with-shared-libraries=0 --with-log=0 --with-mpi=0 --with-clanguage=C++ --with-cxx-dialect=C++11 --CXXFLAGS="-fopenmp -fPIC" --CFLAGS="-fopenmp -fPIC" --with-fortran=0 --FFLAGS="-fopenmp -fPIC" --with-64-bit-indices=0 --with-debugging=0 --with-cc=gcc --with-fc=gfortran --with-cxx=g++ COPTFLAGS= CXXOPTFLAGS= FOPTFLAGS= --download-metis=0 --download-superlu_dist=0 --download-parmetis=0 --with-valgrind-dir=/apps/brown/valgrind/3.13.0_gcc-4.8.5 --download-mumps=1 --with-mumps-serial=1 --with-fortran-kernels=0 --with-blaslapack-lib="-Wl,-rpath,/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -L/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core " --with-blacs-lib=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so --with-blacs-include=/apps/cent7/intel/compilers_and_libraries_2017.1.132/linux/mkl/include --with-scalapack=0 > >>> > <<<<<<< > >>> > > >>> > And configure completed successfully. What issue are you encountering? Why do you think its activating MPI? > >>> > > >>> > Satish > >>> > > >>> > > >>> > On Tue, 19 Nov 2019, Balay, Satish via petsc-users wrote: > >>> > > >>> >> On Tue, 19 Nov 2019, Povolotskyi, Mykhailo via petsc-users wrote: > >>> >> > >>> >>> Hello, > >>> >>> > >>> >>> I'm trying to build PETSC without MPI. > >>> >>> > >>> >>> Even if I specify --with_mpi=0, the configuration script still activates > >>> >>> MPI. > >>> >>> > >>> >>> I attach the configure.log. > >>> >>> > >>> >>> What am I doing wrong? > >>> >> The option is --with-mpi=0 > >>> >> > >>> >> Satish > >>> >> > >>> >> > >>> >>> Thank you, > >>> >>> > >>> >>> Michael. > >>> >>> > >>> >>> > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > From bldenton at buffalo.edu Tue Nov 26 07:22:47 2019 From: bldenton at buffalo.edu (Brandon Denton) Date: Tue, 26 Nov 2019 08:22:47 -0500 Subject: [petsc-users] Petsc Matrix modifications Message-ID: Good Morning, Is it possible to expand a matrix in petsc? I current created and loaded a matrix (6 x 5) which holds information required later in my program. I would like to store additional information in the matrix by expanding its size, let's say make it at 10 x 5 matrix. How is this accomplished in petsc. When I try to use MatSetSize() or MatSetValue() my code throws errors. What is the process for accomplishing this? Thank you in advance for your time. Brandon -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 26 09:00:27 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Nov 2019 09:00:27 -0600 Subject: [petsc-users] Petsc Matrix modifications In-Reply-To: References: Message-ID: On Tue, Nov 26, 2019 at 8:04 AM Brandon Denton wrote: > Good Morning, > > Is it possible to expand a matrix in petsc? 
I current created and loaded a > matrix (6 x 5) which holds information required later in my program. I > would like to store additional information in the matrix by expanding its > size, let's say make it at 10 x 5 matrix. How is this accomplished in > petsc. When I try to use MatSetSize() or MatSetValue() my code throws > errors. What is the process for accomplishing this? > Normally if you change the size, you just make a new object. If you really want to retain the same pointer because it is being held by other objects, you can call MatReset() and rebuild the matrix completely, but I normally would not do this. Thanks, Matt > Thank you in advance for your time. > Brandon > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From perceval.desforges at polytechnique.edu Tue Nov 26 09:23:22 2019 From: perceval.desforges at polytechnique.edu (Perceval Desforges) Date: Tue, 26 Nov 2019 16:23:22 +0100 Subject: [petsc-users] Memory optimization In-Reply-To: References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> <0007da7378494d3bdb15c219872d3359@polytechnique.edu> Message-ID: <7ac1554555a98057cf38e8d69fb2f9c8@polytechnique.edu> Hello, This is the output of -log_view. I selected what I thought were the important parts. I don't know if this is the best format to send the logs. If a text file is better let me know. Thanks again, ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./dos.exe on a named compute-0-11.local with 20 processors, by pcd Tue Nov 26 15:50:50 2019 Using Petsc Release Version 3.10.5, Mar, 28, 2019 Max Max/Min Avg Total Time (sec): 2.214e+03 1.000 2.214e+03 Objects: 1.370e+02 1.030 1.332e+02 Flop: 1.967e+14 1.412 1.539e+14 3.077e+15 Flop/sec: 8.886e+10 1.412 6.950e+10 1.390e+12 MPI Messages: 1.716e+03 1.350 1.516e+03 3.032e+04 MPI Message Lengths: 2.559e+08 5.796 4.179e+04 1.267e+09 MPI Reductions: 3.840e+02 1.000 Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.0000e+02 4.5% 3.0771e+15 100.0% 3.016e+04 99.5% 4.190e+04 99.7% 3.310e+02 86.2% 1: Setting Up EPS: 2.1137e+03 95.5% 7.4307e+09 0.0% 1.600e+02 0.5% 2.000e+04 0.3% 4.600e+01 12.0% ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage PetscBarrier 2 1.0 2.6554e+004632.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0 BuildTwoSidedF 3 1.0 1.2021e-01672.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 8 1.0 1.1364e-02 2.3 8.00e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 1408 VecMDot 11 1.0 4.8588e-02 2.2 6.60e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 3 0 0 0 0 3 2717 VecNorm 12 1.0 5.2616e-02 4.3 1.20e+06 1.0 
0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 4 456 VecScale 12 1.0 9.8681e-04 2.2 6.00e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12160 VecCopy 3 1.0 4.1175e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 108 1.0 9.3610e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1 1.0 1.6284e-04 3.2 1.00e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12282 VecMAXPY 12 1.0 7.6976e-03 1.9 7.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20006 VecScatterBegin 419 1.0 4.5905e-01 3.7 0.00e+00 0.0 2.9e+04 3.7e+04 9.0e+01 0 0 96 85 23 0 0 97 85 27 0 VecScatterEnd 329 1.0 9.3328e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSetRandom 1 1.0 4.3299e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 12 1.0 5.3697e-02 4.2 1.80e+06 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 4 670 MatMult 240 1.0 1.2112e-01 1.5 1.86e+07 1.0 4.4e+02 8.0e+04 0.0e+00 0 0 1 3 0 0 0 1 3 0 3071 MatSolve 101 1.0 9.3087e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33055277 MatCholFctrNum 1 1.0 1.2752e-02 2.8 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 78 MatICCFactorSym 1 1.0 4.0321e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 5 1.7 1.2031e-01501.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 5 1.7 6.6613e-02 2.4 0.00e+00 0.0 1.6e+02 2.0e+04 2.4e+01 0 0 1 0 6 0 0 1 0 7 0 MatGetRowIJ 1 1.0 7.1526e-06 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 1.2271e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLoad 3 1.0 2.8543e-01 1.0 0.00e+00 0.0 3.3e+02 5.6e+05 5.4e+01 0 0 1 15 14 0 0 1 15 16 0 MatView 2 0.0 7.4778e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 2 1.0 1.3866e-0236.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 90 1.0 9.3211e+01 1.0 1.97e+14 1.4 3.0e+04 3.6e+04 1.1e+02 4100 98 85 30 93100 99 85 34 33011509 KSPGMRESOrthog 11 1.0 5.3543e-02 2.0 1.32e+07 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 3 0 0 0 0 3 4931 PCSetUp 2 1.0 1.8253e-02 2.9 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 55 PCSetUpOnBlocks 1 1.0 1.8055e-02 2.9 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 55 PCApply 101 1.0 9.3089e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33054820 EPSSolve 1 1.0 9.5183e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 2.4e+02 4100 97 82 63 95100 97 82 73 32327750 STApply 89 1.0 9.3107e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33048198 STMatSolve 89 1.0 9.3084e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33056525 BVCreate 2 1.0 5.0357e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 0 BVCopy 1 1.0 9.2030e-05 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BVMultVec 132 1.0 7.2259e-01 1.3 5.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 14567 BVMultInPlace 1 1.0 2.2316e-01 1.1 6.40e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 57357 BVDotVec 132 1.0 1.3370e+00 1.1 5.46e+08 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 35 1 0 0 0 40 8169 BVOrthogonalizeV 81 1.0 1.9413e+00 1.1 1.07e+09 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 35 2 0 0 0 40 11048 BVScale 89 1.0 3.0558e-03 1.4 4.45e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29125 BVNormVec 8 1.0 1.5073e-02 1.9 1.20e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 3 0 0 0 0 3 1592 BVSetRandom 1 1.0 4.3440e-03 2.2 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSSolve 1 1.0 2.5339e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSVectors 80 1.0 3.5286e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DSOther 1 1.0 6.0797e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 1: Setting Up EPS BuildTwoSidedF 3 1.0 2.8591e-0211.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4 1.0 6.1312e-03122.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCholFctrSym 1 1.0 1.1540e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 1 0 0 0 1 1 0 0 0 11 0 MatCholFctrNum 2 1.0 2.1019e+03 1.0 1.00e+09 4.3 0.0e+00 0.0e+00 0.0e+00 95 0 0 0 0 99100 0 0 0 4 MatCopy 1 1.0 3.3707e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 4 0 MatConvert 1 1.0 6.1760e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 3 1.0 2.8630e-0211.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 3.2575e-02 1.1 0.00e+00 0.0 1.6e+02 2.0e+04 1.8e+01 0 0 1 0 5 0 0100100 39 0 MatGetRowIJ 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 2.6703e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 1 1.0 1.0121e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 2 1.0 1.1354e-01 1.1 0.00e+00 0.0 1.6e+02 2.0e+04 2.0e+01 0 0 1 0 5 0 0100100 43 0 KSPSetUp 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCSetUp 2 1.0 2.1135e+03 1.0 1.00e+09 4.3 0.0e+00 0.0e+00 1.2e+01 95 0 0 0 3 100100 0 0 26 4 EPSSetUp 1 1.0 2.1137e+03 1.0 1.00e+09 4.3 1.6e+02 2.0e+04 4.6e+01 95 0 1 0 12 100100100100100 4 STSetUp 2 1.0 1.0712e+03 1.0 4.95e+08 4.3 8.0e+01 2.0e+04 2.6e+01 48 0 0 0 7 51 50 50 50 57 3 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Vector 37 50 126614208 0. Matrix 13 17 159831092 0. Viewer 6 5 4200 0. Index Set 12 13 2507240 0. Vec Scatter 5 7 128984 0. Krylov Solver 3 4 22776 0. Preconditioner 3 4 3848 0. EPS Solver 1 2 8632 0. Spectral Transform 1 2 1664 0. Basis Vectors 3 4 45600 0. PetscRandom 2 2 1292 0. Region 1 2 1344 0. Direct Solver 1 2 163856 0. --- Event Stage 1: Setting Up EPS Vector 19 6 729576 0. Matrix 10 6 12178892 0. Index Set 9 8 766336 0. Vec Scatter 4 2 2640 0. Krylov Solver 1 0 0 0. Preconditioner 1 0 0 0. EPS Solver 1 0 0 0. Spectral Transform 1 0 0 0. Basis Vectors 1 0 0 0. Region 1 0 0 0. Direct Solver 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 0.000263596 Average time for zero size MPI_Send(): 5.78523e-05 #PETSc Option Table entries: -log_view -mat_mumps_cntl_3 1e-12 -mat_mumps_icntl_13 1 -mat_mumps_icntl_14 60 -mat_mumps_icntl_24 1 -matload_block_size 1 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --prefix=/share/apps/petsc/3.10.5 --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mpich --download-fblaslapack --download-scalapack --download-mumps Best regards, Perceval, > On Mon, Nov 25, 2019 at 11:45 AM Perceval Desforges wrote: > >> I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite farm apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the set up time is around 400 times greater in the 3D case? Is there nothing to be done? > > No. It is almost certain that preallocation is screwed up. There is no way it can take 400x longer for a few nonzeros. > > In order to debug, please send the output of -log_view and indicate where the time is taken for assembly. You can usually > track down bad preallocation using -info. > > Thanks, > > Matt > > I will try setting up only one partition. > > Thanks, > > Perceval, > Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0". > > Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones. > > The partitions option is intended for running several nodes. If you are using just one node probably it is better to set one partition only. > > Jose > > El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: > > On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges wrote: > Hi, > > So I'm loading two matrices from files, both 1000000 by 10000000. I ran the program with -mat_view::ascii_info and I got: > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=1000000, allocated nonzeros=1000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well, and then > > Mat Object: 1 MPI processes > type: seqaij > rows=1000000, cols=1000000 > total: nonzeros=7000000, allocated nonzeros=7000000 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > > 20 times as well before crashing. > > I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. > > I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. 
The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? > > I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create > your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the > preallocation correctly. > > Thanks, > > Matt > Thanks again, > > Best regards, > > Perceval, > > Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with > -mat_view ::ascii_info > > Jose > > El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: > > Hi, > > Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: > > slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed > > I get this error even when there are no eigenvalues in the interval. > > I've started using BVMAT instead of BVVECS by the way. > > Thanks, > > Perceval, > > Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. > > Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: > http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html > You can comment out the call to EPSSolve() and run with the option -show_inertias > For example, the output > Shift 0.1 Inertia 3 > Shift 0.35 Inertia 11 > means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). > > By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). > > Jose > > El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: > > Hello all, > > I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. > > The options I use are : > > -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 > > However the program quickly crashes with this error: > > slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed > > I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: > > [1]PETSC ERROR: Error in external library > [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 > > which is an error due to setting the mumps icntl option so low from what I've gathered. > > Is there any other way I can reduce memory usage? > > Thanks, > > Regards, > > Perceval, > > P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ [1] Links: ------ [1] http://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Nov 26 10:11:15 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 26 Nov 2019 16:11:15 +0000 Subject: [petsc-users] Memory optimization In-Reply-To: <7ac1554555a98057cf38e8d69fb2f9c8@polytechnique.edu> References: <7e97478cd26eb88d653bfd7617e92f18@polytechnique.edu> <852862500ebf52db0edde47a63ce8ae7@polytechnique.edu> <9177FF06-92F4-40E4-8CF6-4A2995C17072@dsic.upv.es> <792624a5b70858444b7e529ce3624395@polytechnique.edu> <0007da7378494d3bdb15c219872d3359@polytechnique.edu> <7ac1554555a98057cf38e8d69fb2f9c8@polytechnique.edu> Message-ID: > I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite farm apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the set up time is around 400 times greater in the 3D case? Is there nothing to be done? Yes, sparse direct solver behavior between 2d and 3d problems can be dramatically different in both space and time. There is a well developed understanding of this from the 1970s. For 2d the results are given in https://epubs.siam.org/doi/abs/10.1137/0710032?journalCode=sjnaam work is n^3 space is n^2 log (n) using nested dissection ordering In 3d work is n^6 see http://amath.colorado.edu/faculty/martinss/2014_CBMS/Lectures/lecture06.pdf So 3d is very limited for direct solvers; and one has to try something else. Barry > On Nov 26, 2019, at 9:23 AM, Perceval Desforges wrote: > > Hello, > > This is the output of -log_view. I selected what I thought were the important parts. I don't know if this is the best format to send the logs. If a text file is better let me know. 
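Rough numbers behind Barry's estimates above (standard nested-dissection results; the grid sizes are illustrative and assume about 10^6 unknowns in both cases):

2-D, N = n^2 unknowns:  factorization work O(n^3) = O(N^(3/2)),  fill O(n^2 log n) = O(N log N)
3-D, N = n^3 unknowns:  factorization work O(n^6) = O(N^2),      fill O(n^4) = O(N^(4/3))

With N ~ 10^6 (so n ~ 1000 in 2-D but n ~ 100 in 3-D), the factorization work grows from roughly 10^9 to about 10^12 operations, a factor on the order of 1000, so a setup time several hundred times larger in 3-D is what the factorization alone would suggest, independent of preallocation.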
Thanks again, > > > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./dos.exe on a named compute-0-11.local with 20 processors, by pcd Tue Nov 26 15:50:50 2019 > Using Petsc Release Version 3.10.5, Mar, 28, 2019 > > Max Max/Min Avg Total > Time (sec): 2.214e+03 1.000 2.214e+03 > Objects: 1.370e+02 1.030 1.332e+02 > Flop: 1.967e+14 1.412 1.539e+14 3.077e+15 > Flop/sec: 8.886e+10 1.412 6.950e+10 1.390e+12 > MPI Messages: 1.716e+03 1.350 1.516e+03 3.032e+04 > MPI Message Lengths: 2.559e+08 5.796 4.179e+04 1.267e+09 > MPI Reductions: 3.840e+02 1.000 > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 1.0000e+02 4.5% 3.0771e+15 100.0% 3.016e+04 99.5% 4.190e+04 99.7% 3.310e+02 86.2% > 1: Setting Up EPS: 2.1137e+03 95.5% 7.4307e+09 0.0% 1.600e+02 0.5% 2.000e+04 0.3% 4.600e+01 12.0% > > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > PetscBarrier 2 1.0 2.6554e+004632.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 0 0 0 0 0 > BuildTwoSidedF 3 1.0 1.2021e-01672.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 8 1.0 1.1364e-02 2.3 8.00e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 1408 > VecMDot 11 1.0 4.8588e-02 2.2 6.60e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 3 0 0 0 0 3 2717 > VecNorm 12 1.0 5.2616e-02 4.3 1.20e+06 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 4 456 > VecScale 12 1.0 9.8681e-04 2.2 6.00e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12160 > VecCopy 3 1.0 4.1175e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 108 1.0 9.3610e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 1 1.0 1.6284e-04 3.2 1.00e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 12282 > VecMAXPY 12 1.0 7.6976e-03 1.9 7.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 20006 > VecScatterBegin 419 1.0 4.5905e-01 3.7 0.00e+00 0.0 2.9e+04 3.7e+04 9.0e+01 0 0 96 85 23 0 0 97 85 27 0 > VecScatterEnd 329 1.0 9.3328e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecSetRandom 1 1.0 4.3299e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 12 1.0 5.3697e-02 4.2 1.80e+06 1.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 4 670 > MatMult 240 1.0 1.2112e-01 1.5 1.86e+07 1.0 4.4e+02 8.0e+04 0.0e+00 0 0 1 3 0 0 0 1 3 0 3071 > MatSolve 101 1.0 9.3087e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33055277 > MatCholFctrNum 1 1.0 1.2752e-02 2.8 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 78 > MatICCFactorSym 1 1.0 4.0321e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 5 1.7 1.2031e-01501.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 5 1.7 6.6613e-02 2.4 0.00e+00 0.0 1.6e+02 2.0e+04 2.4e+01 0 0 1 0 6 0 0 1 0 7 0 > MatGetRowIJ 1 1.0 7.1526e-06 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 1.2271e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 
0 0 0 0 0 0 0 0 0 0 > MatLoad 3 1.0 2.8543e-01 1.0 0.00e+00 0.0 3.3e+02 5.6e+05 5.4e+01 0 0 1 15 14 0 0 1 15 16 0 > MatView 2 0.0 7.4778e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 2 1.0 1.3866e-0236.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 90 1.0 9.3211e+01 1.0 1.97e+14 1.4 3.0e+04 3.6e+04 1.1e+02 4100 98 85 30 93100 99 85 34 33011509 > KSPGMRESOrthog 11 1.0 5.3543e-02 2.0 1.32e+07 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 3 0 0 0 0 3 4931 > PCSetUp 2 1.0 1.8253e-02 2.9 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 55 > PCSetUpOnBlocks 1 1.0 1.8055e-02 2.9 5.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 55 > PCApply 101 1.0 9.3089e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33054820 > EPSSolve 1 1.0 9.5183e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 2.4e+02 4100 97 82 63 95100 97 82 73 32327750 > STApply 89 1.0 9.3107e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33048198 > STMatSolve 89 1.0 9.3084e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 9.1e+01 4100 97 82 24 93100 97 82 27 33056525 > BVCreate 2 1.0 5.0357e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 2 0 0 0 0 2 0 > BVCopy 1 1.0 9.2030e-05 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > BVMultVec 132 1.0 7.2259e-01 1.3 5.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 14567 > BVMultInPlace 1 1.0 2.2316e-01 1.1 6.40e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 57357 > BVDotVec 132 1.0 1.3370e+00 1.1 5.46e+08 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 35 1 0 0 0 40 8169 > BVOrthogonalizeV 81 1.0 1.9413e+00 1.1 1.07e+09 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 35 2 0 0 0 40 11048 > BVScale 89 1.0 3.0558e-03 1.4 4.45e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 29125 > BVNormVec 8 1.0 1.5073e-02 1.9 1.20e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 3 0 0 0 0 3 1592 > BVSetRandom 1 1.0 4.3440e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSSolve 1 1.0 2.5339e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSVectors 80 1.0 3.5286e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSOther 1 1.0 6.0797e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > --- Event Stage 1: Setting Up EPS > > BuildTwoSidedF 3 1.0 2.8591e-0211.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 4 1.0 6.1312e-03122.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatCholFctrSym 1 1.0 1.1540e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 1 0 0 0 1 1 0 0 0 11 0 > MatCholFctrNum 2 1.0 2.1019e+03 1.0 1.00e+09 4.3 0.0e+00 0.0e+00 0.0e+00 95 0 0 0 0 99100 0 0 0 4 > MatCopy 1 1.0 3.3707e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 4 0 > MatConvert 1 1.0 6.1760e-03 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 3 1.0 2.8630e-0211.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 3 1.0 3.2575e-02 1.1 0.00e+00 0.0 1.6e+02 2.0e+04 1.8e+01 0 0 1 0 5 0 0100100 39 0 > MatGetRowIJ 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 2.6703e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatZeroEntries 1 1.0 1.0121e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAXPY 2 1.0 1.1354e-01 1.1 0.00e+00 0.0 1.6e+02 2.0e+04 2.0e+01 0 0 1 0 5 0 0100100 43 0 > KSPSetUp 2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PCSetUp 
2 1.0 2.1135e+03 1.0 1.00e+09 4.3 0.0e+00 0.0e+00 1.2e+01 95 0 0 0 3 100100 0 0 26 4 > EPSSetUp 1 1.0 2.1137e+03 1.0 1.00e+09 4.3 1.6e+02 2.0e+04 4.6e+01 95 0 1 0 12 100100100100100 4 > STSetUp 2 1.0 1.0712e+03 1.0 4.95e+08 4.3 8.0e+01 2.0e+04 2.6e+01 48 0 0 0 7 51 50 50 50 57 3 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Vector 37 50 126614208 0. > Matrix 13 17 159831092 0. > Viewer 6 5 4200 0. > Index Set 12 13 2507240 0. > Vec Scatter 5 7 128984 0. > Krylov Solver 3 4 22776 0. > Preconditioner 3 4 3848 0. > EPS Solver 1 2 8632 0. > Spectral Transform 1 2 1664 0. > Basis Vectors 3 4 45600 0. > PetscRandom 2 2 1292 0. > Region 1 2 1344 0. > Direct Solver 1 2 163856 0. > > --- Event Stage 1: Setting Up EPS > > Vector 19 6 729576 0. > Matrix 10 6 12178892 0. > Index Set 9 8 766336 0. > Vec Scatter 4 2 2640 0. > Krylov Solver 1 0 0 0. > Preconditioner 1 0 0 0. > EPS Solver 1 0 0 0. > Spectral Transform 1 0 0 0. > Basis Vectors 1 0 0 0. > Region 1 0 0 0. > Direct Solver 1 0 0 0. > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 0.000263596 > Average time for zero size MPI_Send(): 5.78523e-05 > #PETSc Option Table entries: > -log_view > -mat_mumps_cntl_3 1e-12 > -mat_mumps_icntl_13 1 > -mat_mumps_icntl_14 60 > -mat_mumps_icntl_24 1 > -matload_block_size 1 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --prefix=/share/apps/petsc/3.10.5 --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --download-mpich --download-fblaslapack --download-scalapack --download-mumps > > > > Best regards, > > Perceval, > > > > > >> On Mon, Nov 25, 2019 at 11:45 AM Perceval Desforges wrote: >> I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite farm apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the set up time is around 400 times greater in the 3D case? Is there nothing to be done? >> >> >> >> No. It is almost certain that preallocation is screwed up. There is no way it can take 400x longer for a few nonzeros. >> >> In order to debug, please send the output of -log_view and indicate where the time is taken for assembly. You can usually >> track down bad preallocation using -info. >> >> Thanks, >> >> Matt >> I will try setting up only one partition. >> >> Thanks, >> >> Perceval, >> >> Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0". >> >> Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones. >> >> The partitions option is intended for running several nodes. If you are using just one node probably it is better to set one partition only. 
>> >> Jose >> >> >> El 25 nov 2019, a las 18:25, Matthew Knepley escribi?: >> >> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges wrote: >> Hi, >> >> So I'm loading two matrices from files, both 1000000 by 10000000. I ran the program with -mat_view::ascii_info and I got: >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1000000, cols=1000000 >> total: nonzeros=7000000, allocated nonzeros=7000000 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> >> 20 times, and then >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1000000, cols=1000000 >> total: nonzeros=1000000, allocated nonzeros=1000000 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> >> 20 times as well, and then >> >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1000000, cols=1000000 >> total: nonzeros=7000000, allocated nonzeros=7000000 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> >> 20 times as well before crashing. >> >> I realized it might be because I am setting up 20 krylov schur partitions which may be too much. I tried running the code again with only 2 partitions and now the code runs but I have speed issues. >> >> I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the set up time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries) and the set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and set up time so much? >> >> >> I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create >> your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the >> preallocation correctly. >> >> Thanks, >> >> Matt >> Thanks again, >> >> Best regards, >> >> Perceval, >> >> >> >> >> >> Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with >> -mat_view ::ascii_info >> >> Jose >> >> >> El 22 nov 2019, a las 19:56, Perceval Desforges escribi?: >> >> Hi, >> >> Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error: >> >> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed >> >> I get this error even when there are no eigenvalues in the interval. >> >> I've started using BVMAT instead of BVVECS by the way. >> >> Thanks, >> >> Perceval, >> >> >> >> >> >> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS. >> >> Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per each eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: >> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html >> You can comment out the call to EPSSolve() and run with the option -show_inertias >> For example, the output >> Shift 0.1 Inertia 3 >> Shift 0.35 Inertia 11 >> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3). 
>> >> By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower). >> >> Jose >> >> >> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users escribi?: >> >> Hello all, >> >> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 Go of RAM. >> >> The options I use are : >> >> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12 >> >> >> >> However the program quickly crashes with this error: >> >> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed >> >> I've tried reducing the amount of memory used by slepc with the -mat_mumps_icntl_14 option by setting it at -70 for example but then I get this error: >> >> [1]PETSC ERROR: Error in external library >> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614 >> >> which is an error due to setting the mumps icntl option so low from what I've gathered. >> >> Is there any other way I can reduce memory usage? >> >> >> >> Thanks, >> >> Regards, >> >> Perceval, >> >> >> >> P.S. I sent the same email a few minutes ago but I think I made a mistake in the address, I'm sorry if I've sent it twice. >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > From danyang.su at gmail.com Tue Nov 26 11:42:34 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 26 Nov 2019 09:42:34 -0800 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: <04c21713-b569-0b30-fd48-d1cbf4bb53c9@gmail.com> On 2019-11-25 7:54 p.m., Matthew Knepley wrote: > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh > wrote: > > Dear PETSc users and developers, > > I am working with dmplex to distribute a 3D unstructured mesh made > of tetrahedrons in a cuboidal domain. I had a few queries: > 1) Is there any way of ensuring load balancing based on the number > of vertices per MPI process. > > > You can now call?DMPlexRebalanceSharedPoints() to try and get better > balance of vertices. Hi Matt, I just want to follow up if this new function can help to solve the "Strange Partition in PETSc 3.11" problem I mentioned before. Would you please let me know when shall I call this function? Right before DMPlexDistribute? call DMPlexCreateFromCellList call DMPlexGetPartitioner call PetscPartitionerSetFromOptions call DMPlexDistribute Thanks, Danyang > 2) As the global domain is cuboidal, is the resulting domain > decomposition also cuboidal on every MPI process? If not, is there > a way to ensure this? For example in DMDA, the default domain > decomposition for a cuboidal domain is cuboidal. > > > It sounds like you do not want something that is actually > unstructured. Rather, it seems like you want to > take a DMDA type thing and split it into tets. You can get a cuboidal > decomposition of a hex mesh easily. 
> Call DMPlexCreateBoxMesh() with one cell for every process, > distribute, and then uniformly refine. This > will not quite work for tets since the mesh partitioner will tend to > violate that constraint. You could: > > ? a) Prescribe the distribution yourself using the Shell partitioner type > > or > > ? b) Write a refiner that turns hexes into tets > > We already have a refiner that turns tets into hexes, but we never > wrote the other direction because it was not clear > that it was useful. > > ? Thanks, > > ? ? ?Matt > > Sincerely, > SG > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 26 12:18:50 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Nov 2019 12:18:50 -0600 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: <04c21713-b569-0b30-fd48-d1cbf4bb53c9@gmail.com> References: <04c21713-b569-0b30-fd48-d1cbf4bb53c9@gmail.com> Message-ID: On Tue, Nov 26, 2019 at 11:43 AM Danyang Su wrote: > On 2019-11-25 7:54 p.m., Matthew Knepley wrote: > > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh > wrote: > >> Dear PETSc users and developers, >> >> I am working with dmplex to distribute a 3D unstructured mesh made of >> tetrahedrons in a cuboidal domain. I had a few queries: >> 1) Is there any way of ensuring load balancing based on the number of >> vertices per MPI process. >> > > You can now call DMPlexRebalanceSharedPoints() to try and get better > balance of vertices. > > Hi Matt, > > I just want to follow up if this new function can help to solve the > "Strange Partition in PETSc 3.11" problem I mentioned before. Would you > please let me know when shall I call this function? Right before > DMPlexDistribute? > This is not the problem. I believe the problem is that you are partitioning hybrid cells, and the way we handle them internally changed, which I think screwed up the dual mesh for partitioning in your example. I have been sick, so I have not gotten to your example yet, but I will. Sorry about that, Matt > call DMPlexCreateFromCellList > > call DMPlexGetPartitioner > > call PetscPartitionerSetFromOptions > > call DMPlexDistribute > > Thanks, > > Danyang > > > >> 2) As the global domain is cuboidal, is the resulting domain >> decomposition also cuboidal on every MPI process? If not, is there a way to >> ensure this? For example in DMDA, the default domain decomposition for a >> cuboidal domain is cuboidal. >> > > It sounds like you do not want something that is actually unstructured. > Rather, it seems like you want to > take a DMDA type thing and split it into tets. You can get a cuboidal > decomposition of a hex mesh easily. > Call DMPlexCreateBoxMesh() with one cell for every process, distribute, > and then uniformly refine. This > will not quite work for tets since the mesh partitioner will tend to > violate that constraint. You could: > > a) Prescribe the distribution yourself using the Shell partitioner type > > or > > b) Write a refiner that turns hexes into tets > > We already have a refiner that turns tets into hexes, but we never wrote > the other direction because it was not clear > that it was useful. 
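[A rough C sketch of the recipe described above: a coarse hex box mesh with roughly one cell per rank, distributed and then uniformly refined. The helper name is made up, the one-dimensional split over ranks is only illustrative, and the DMPlexCreateBoxMesh() argument list follows the PETSc 3.11-era interface, which may differ in other releases:]

#include <petscdmplex.h>

/* Sketch: cuboidal decomposition by starting from a coarse hex box mesh,
   distributing it, and refining uniformly (regular refinement is the
   DMPlex default). */
PetscErrorCode BuildDistributedBox(MPI_Comm comm, PetscInt nrefine, DM *dmout)
{
  DM             dm, dmDist, dmRef;
  PetscInt       r, faces[3];
  PetscMPIInt    size;
  PetscErrorCode ierr;

  ierr = MPI_Comm_size(comm, &size);CHKERRQ(ierr);
  /* crude 1D split into 'size' cells; a real code would factor 'size' into a 3D grid */
  faces[0] = size; faces[1] = 1; faces[2] = 1;
  ierr = DMPlexCreateBoxMesh(comm, 3, PETSC_FALSE /* hexes */, faces,
                             NULL, NULL, NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMPlexDistribute(dm, 0, NULL, &dmDist);CHKERRQ(ierr);
  if (dmDist) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist; }
  for (r = 0; r < nrefine; r++) {
    ierr = DMRefine(dm, comm, &dmRef);CHKERRQ(ierr);
    if (dmRef) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmRef; }
  }
  *dmout = dm;
  return 0;
}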
> > Thanks, > > Matt > > >> Sincerely, >> SG >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Tue Nov 26 12:24:22 2019 From: danyang.su at gmail.com (Danyang Su) Date: Tue, 26 Nov 2019 10:24:22 -0800 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: <04c21713-b569-0b30-fd48-d1cbf4bb53c9@gmail.com> Message-ID: <0049d72d-cefd-20f5-1fb7-a81295fbaf63@gmail.com> On 2019-11-26 10:18 a.m., Matthew Knepley wrote: > On Tue, Nov 26, 2019 at 11:43 AM Danyang Su > wrote: > > On 2019-11-25 7:54 p.m., Matthew Knepley wrote: >> On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh >> > wrote: >> >> Dear PETSc users and developers, >> >> I am working with dmplex to distribute a 3D unstructured mesh >> made of tetrahedrons in a cuboidal domain. I had a few queries: >> 1) Is there any way of ensuring load balancing based on the >> number of vertices per MPI process. >> >> >> You can now call?DMPlexRebalanceSharedPoints() to try and get >> better balance of vertices. > > Hi Matt, > > I just want to follow up if this new function can help to solve > the "Strange Partition in PETSc 3.11" problem I mentioned before. > Would you please let me know when shall I call this function? > Right before DMPlexDistribute? > > This is not the problem. I believe the problem is that you are > partitioning hybrid cells, and the way we handle > them internally changed, which I think screwed up the dual mesh for > partitioning in your example. I have been > sick, so I have not gotten to your example yet, but I will. Hope you are getting well soon. The mesh is not hybrid, only prism cells layer by layer. But the height of the prism varies significantly. Thanks, Danyang > > ? Sorry about that, > > ? ? Matt > > call DMPlexCreateFromCellList > > call DMPlexGetPartitioner > > call PetscPartitionerSetFromOptions > > call DMPlexDistribute > > Thanks, > > Danyang > >> 2) As the global domain is cuboidal, is the resulting domain >> decomposition also cuboidal on every MPI process? If not, is >> there a way to ensure this? For example in DMDA, the default >> domain decomposition for a cuboidal domain is cuboidal. >> >> >> It sounds like you do not want something that is actually >> unstructured. Rather, it seems like you want to >> take a DMDA type thing and split it into tets. You can get a >> cuboidal decomposition of a hex mesh easily. >> Call DMPlexCreateBoxMesh() with one cell for every process, >> distribute, and then uniformly refine. This >> will not quite work for tets since the mesh partitioner will tend >> to violate that constraint. You could: >> >> ? a) Prescribe the distribution yourself using the Shell >> partitioner type >> >> or >> >> ? b) Write a refiner that turns hexes into tets >> >> We already have a refiner that turns tets into hexes, but we >> never wrote the other direction because it was not clear >> that it was useful. >> >> ? Thanks, >> >> ? ? 
?Matt >> >> Sincerely, >> SG >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 26 12:34:54 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Nov 2019 12:34:54 -0600 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: <0049d72d-cefd-20f5-1fb7-a81295fbaf63@gmail.com> References: <04c21713-b569-0b30-fd48-d1cbf4bb53c9@gmail.com> <0049d72d-cefd-20f5-1fb7-a81295fbaf63@gmail.com> Message-ID: On Tue, Nov 26, 2019 at 12:24 PM Danyang Su wrote: > On 2019-11-26 10:18 a.m., Matthew Knepley wrote: > > On Tue, Nov 26, 2019 at 11:43 AM Danyang Su wrote: > >> On 2019-11-25 7:54 p.m., Matthew Knepley wrote: >> >> On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh >> wrote: >> >>> Dear PETSc users and developers, >>> >>> I am working with dmplex to distribute a 3D unstructured mesh made of >>> tetrahedrons in a cuboidal domain. I had a few queries: >>> 1) Is there any way of ensuring load balancing based on the number of >>> vertices per MPI process. >>> >> >> You can now call DMPlexRebalanceSharedPoints() to try and get better >> balance of vertices. >> >> Hi Matt, >> >> I just want to follow up if this new function can help to solve the >> "Strange Partition in PETSc 3.11" problem I mentioned before. Would you >> please let me know when shall I call this function? Right before >> DMPlexDistribute? >> > This is not the problem. I believe the problem is that you are > partitioning hybrid cells, and the way we handle > them internally changed, which I think screwed up the dual mesh for > partitioning in your example. I have been > sick, so I have not gotten to your example yet, but I will. > > Hope you are getting well soon. The mesh is not hybrid, only prism cells > layer by layer. > Prism cells are called "hybrid" right now, which is indeed a bad term and I will change. Thanks, Matt > But the height of the prism varies significantly. > > Thanks, > > Danyang > > > Sorry about that, > > Matt > >> call DMPlexCreateFromCellList >> >> call DMPlexGetPartitioner >> >> call PetscPartitionerSetFromOptions >> >> call DMPlexDistribute >> >> Thanks, >> >> Danyang >> >> >> >>> 2) As the global domain is cuboidal, is the resulting domain >>> decomposition also cuboidal on every MPI process? If not, is there a way to >>> ensure this? For example in DMDA, the default domain decomposition for a >>> cuboidal domain is cuboidal. >>> >> >> It sounds like you do not want something that is actually unstructured. >> Rather, it seems like you want to >> take a DMDA type thing and split it into tets. You can get a cuboidal >> decomposition of a hex mesh easily. >> Call DMPlexCreateBoxMesh() with one cell for every process, distribute, >> and then uniformly refine. This >> will not quite work for tets since the mesh partitioner will tend to >> violate that constraint. 
You could: >> >> a) Prescribe the distribution yourself using the Shell partitioner type >> >> or >> >> b) Write a refiner that turns hexes into tets >> >> We already have a refiner that turns tets into hexes, but we never wrote >> the other direction because it was not clear >> that it was useful. >> >> Thanks, >> >> Matt >> >> >>> Sincerely, >>> SG >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pranayreddy865 at gmail.com Thu Nov 28 21:07:57 2019 From: pranayreddy865 at gmail.com (baikadi pranay) Date: Thu, 28 Nov 2019 20:07:57 -0700 Subject: [petsc-users] Outputting matrix for viewing in matlab Message-ID: Hello PETSc users, I have a sparse matrix built and I want to output the matrix for viewing in matlab. However i'm having difficulty outputting the matrix. I am writing my program in Fortran90 and I've included the following lines to output the matrix. *call PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr) call PetscViewerBinaryGetDescriptor(view,fd,ierr) call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr)* These lines do create a matrix but matlab says its not a binary file. Could you please provide me some inputs on where I'm going wrong and how to proceed with this problem. I can provide any further information that you might need to help me solve this problem. Thank you. Sincerely, Pranay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Thu Nov 28 21:44:57 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Thu, 28 Nov 2019 19:44:57 -0800 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: Hi Barry, "Why do you need a cuboidal domain decomposition?" I gave it some thought. I don't always need a cuboidal decomposition. But I would need something that essentially minimized the surface area of the faces of each decomposition. Is there a way to get this? Could you please direct me to a reference a reference where I can read about the domain decomposition strategies used in petsc dmplex. Sincerely, Swarnava On Mon, Nov 25, 2019 at 9:02 PM Smith, Barry F. wrote: > > "No, I have an unstructured mesh that increases in resolution away from > the center of the cuboid. See Figure: 5 in the ArXiv paper > https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane of > the cuboid. Given this type of mesh, will dmplex do a cuboidal domain > decomposition?" > > No definitely not. Why do you need a cuboidal domain decomposition? 
> > Barry > > > > On Nov 25, 2019, at 10:45 PM, Swarnava Ghosh > wrote: > > > > Hi Matt, > > > > > > https://arxiv.org/pdf/1907.02604.pdf > > > > On Mon, Nov 25, 2019 at 7:54 PM Matthew Knepley > wrote: > > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh > wrote: > > Dear PETSc users and developers, > > > > I am working with dmplex to distribute a 3D unstructured mesh made of > tetrahedrons in a cuboidal domain. I had a few queries: > > 1) Is there any way of ensuring load balancing based on the number of > vertices per MPI process. > > > > You can now call DMPlexRebalanceSharedPoints() to try and get better > balance of vertices. > > > > Thank you for pointing out this function! > > > > 2) As the global domain is cuboidal, is the resulting domain > decomposition also cuboidal on every MPI process? If not, is there a way to > ensure this? For example in DMDA, the default domain decomposition for a > cuboidal domain is cuboidal. > > > > It sounds like you do not want something that is actually unstructured. > Rather, it seems like you want to > > take a DMDA type thing and split it into tets. You can get a cuboidal > decomposition of a hex mesh easily. > > Call DMPlexCreateBoxMesh() with one cell for every process, distribute, > and then uniformly refine. This > > will not quite work for tets since the mesh partitioner will tend to > violate that constraint. You could: > > > > No, I have an unstructured mesh that increases in resolution away from > the center of the cuboid. See Figure: 5 in the ArXiv paper > https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane of > the cuboid. Given this type of mesh, will dmplex do a cuboidal domain > decomposition? > > > > Sincerely, > > SG > > > > a) Prescribe the distribution yourself using the Shell partitioner type > > > > or > > > > b) Write a refiner that turns hexes into tets > > > > We already have a refiner that turns tets into hexes, but we never wrote > the other direction because it was not clear > > that it was useful. > > > > Thanks, > > > > Matt > > > > Sincerely, > > SG > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Nov 29 02:14:51 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 29 Nov 2019 09:14:51 +0100 Subject: [petsc-users] Outputting matrix for viewing in matlab In-Reply-To: References: Message-ID: PETSc has its own binary format, which is not the same as MATLAB's. However, PETSc includes some MATLAB/Octave scripts which will load these binary files. See $PETSC_DIR/share/matlab/PetscBinaryRead.m - there are some examples in the comments at the top of that file. Note that you will probably want to add $PETSC_DIR/share/matlab to your MATLAB path so that you can run the script. This is what I have for Octave, but I'm not sure if it this, precisely, works in MATLAB: $ cat ~/.octaverc PETSC_DIR=getenv('PETSC_DIR'); if length(PETSC_DIR)==0 PETSC_DIR='~/code/petsc' end addpath([PETSC_DIR,'/share/petsc/matlab']) (As an aside, note that there are also scripts included to load PETSc binary files to use with numpy/scipy in Python, e.g. 
$PETSC_DIR/lib/petsc/bin/PetscBinaryIO.py) > Am 29.11.2019 um 04:07 schrieb baikadi pranay : > > Hello PETSc users, > > I have a sparse matrix built and I want to output the matrix for viewing in matlab. However i'm having difficulty outputting the matrix. I am writing my program in Fortran90 and I've included the following lines to output the matrix. > > call PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr) > call PetscViewerBinaryGetDescriptor(view,fd,ierr) > call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr) > > These lines do create a matrix but matlab says its not a binary file. Could you please provide me some inputs on where I'm going wrong and how to proceed with this problem. I can provide any further information that you might need to help me solve this problem. > > > Thank you. > > Sincerely, > Pranay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Nov 29 02:16:19 2019 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 29 Nov 2019 09:16:19 +0100 Subject: [petsc-users] Outputting matrix for viewing in matlab In-Reply-To: References: Message-ID: > Am 29.11.2019 um 09:14 schrieb Patrick Sanan : > > PETSc has its own binary format, which is not the same as MATLAB's. > > However, PETSc includes some MATLAB/Octave scripts which will load these binary files. > > See $PETSC_DIR/share/matlab/PetscBinaryRead.m - there are some examples in the comments at the top of that file. Correction: $PETSC_DIR/share/petsc/matlab/PetscBinaryRead.m > > > Note that you will probably want to add $PETSC_DIR/share/matlab to your MATLAB path so that you can run the script. This is what I have for Octave, but I'm not sure if it this, precisely, works in MATLAB: > > $ cat ~/.octaverc > PETSC_DIR=getenv('PETSC_DIR'); > if length(PETSC_DIR)==0 > PETSC_DIR='~/code/petsc' > end > addpath([PETSC_DIR,'/share/petsc/matlab']) > > (As an aside, note that there are also scripts included to load PETSc binary files to use with numpy/scipy in Python, e.g. $PETSC_DIR/lib/petsc/bin/PetscBinaryIO.py) > >> Am 29.11.2019 um 04:07 schrieb baikadi pranay >: >> >> Hello PETSc users, >> >> I have a sparse matrix built and I want to output the matrix for viewing in matlab. However i'm having difficulty outputting the matrix. I am writing my program in Fortran90 and I've included the following lines to output the matrix. >> >> call PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr) >> call PetscViewerBinaryGetDescriptor(view,fd,ierr) >> call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr) >> >> These lines do create a matrix but matlab says its not a binary file. Could you please provide me some inputs on where I'm going wrong and how to proceed with this problem. I can provide any further information that you might need to help me solve this problem. >> >> >> Thank you. >> >> Sincerely, >> Pranay. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 29 08:44:14 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 29 Nov 2019 08:44:14 -0600 Subject: [petsc-users] Domain decomposition using DMPLEX In-Reply-To: References: Message-ID: On Thu, Nov 28, 2019 at 9:45 PM Swarnava Ghosh wrote: > Hi Barry, > > "Why do you need a cuboidal domain decomposition?" > > I gave it some thought. I don't always need a cuboidal decomposition. 
But > I would need something that essentially minimized the surface area of the > faces of each decomposition. Is there a way to get this? Could you please > direct me to a reference a reference where I can read about the domain > decomposition strategies used in petsc dmplex. > This is the point of graph partitioning, which minimizes the "cut" which the the number of links between one partition and another. The ParMetis manual has this kind of information, and citations. Thanks, Matt > Sincerely, > Swarnava > > On Mon, Nov 25, 2019 at 9:02 PM Smith, Barry F. > wrote: > >> >> "No, I have an unstructured mesh that increases in resolution away from >> the center of the cuboid. See Figure: 5 in the ArXiv paper >> https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane >> of the cuboid. Given this type of mesh, will dmplex do a cuboidal domain >> decomposition?" >> >> No definitely not. Why do you need a cuboidal domain decomposition? >> >> Barry >> >> >> > On Nov 25, 2019, at 10:45 PM, Swarnava Ghosh >> wrote: >> > >> > Hi Matt, >> > >> > >> > https://arxiv.org/pdf/1907.02604.pdf >> > >> > On Mon, Nov 25, 2019 at 7:54 PM Matthew Knepley >> wrote: >> > On Mon, Nov 25, 2019 at 6:25 PM Swarnava Ghosh >> wrote: >> > Dear PETSc users and developers, >> > >> > I am working with dmplex to distribute a 3D unstructured mesh made of >> tetrahedrons in a cuboidal domain. I had a few queries: >> > 1) Is there any way of ensuring load balancing based on the number of >> vertices per MPI process. >> > >> > You can now call DMPlexRebalanceSharedPoints() to try and get better >> balance of vertices. >> > >> > Thank you for pointing out this function! >> > >> > 2) As the global domain is cuboidal, is the resulting domain >> decomposition also cuboidal on every MPI process? If not, is there a way to >> ensure this? For example in DMDA, the default domain decomposition for a >> cuboidal domain is cuboidal. >> > >> > It sounds like you do not want something that is actually unstructured. >> Rather, it seems like you want to >> > take a DMDA type thing and split it into tets. You can get a cuboidal >> decomposition of a hex mesh easily. >> > Call DMPlexCreateBoxMesh() with one cell for every process, distribute, >> and then uniformly refine. This >> > will not quite work for tets since the mesh partitioner will tend to >> violate that constraint. You could: >> > >> > No, I have an unstructured mesh that increases in resolution away from >> the center of the cuboid. See Figure: 5 in the ArXiv paper >> https://arxiv.org/pdf/1907.02604.pdf for a slice through the midplane >> of the cuboid. Given this type of mesh, will dmplex do a cuboidal domain >> decomposition? >> > >> > Sincerely, >> > SG >> > >> > a) Prescribe the distribution yourself using the Shell partitioner >> type >> > >> > or >> > >> > b) Write a refiner that turns hexes into tets >> > >> > We already have a refiner that turns tets into hexes, but we never >> wrote the other direction because it was not clear >> > that it was useful. >> > >> > Thanks, >> > >> > Matt >> > >> > Sincerely, >> > SG >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > https://www.cse.buffalo.edu/~knepley/ >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Nov 29 11:39:24 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 29 Nov 2019 17:39:24 +0000 Subject: [petsc-users] Outputting matrix for viewing in matlab In-Reply-To: References: Message-ID: <40760746-8753-4B63-9F16-31B55C5A9005@anl.gov> > On Nov 28, 2019, at 7:07 PM, baikadi pranay wrote: > > Hello PETSc users, > > I have a sparse matrix built and I want to output the matrix for viewing in matlab. However i'm having difficulty outputting the matrix. I am writing my program in Fortran90 and I've included the following lines to output the matrix. > > call PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr) Normally on would use to save the matrix and then use the scripts Patrick mentioned to read the matrix into Matlab or Python. call MatView(matrix, view,ierr) call PetscViewerDestroy(view,ierr) > call PetscViewerBinaryGetDescriptor(view,fd,ierr) > call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr) > > These lines do create a matrix but matlab says its not a binary file. Could you please provide me some inputs on where I'm going wrong and how to proceed with this problem. I can provide any further information that you might need to help me solve this problem. > > > Thank you. > > Sincerely, > Pranay. From fe.wallner at gmail.com Fri Nov 29 18:14:36 2019 From: fe.wallner at gmail.com (Felipe Giacomelli) Date: Fri, 29 Nov 2019 22:14:36 -0200 Subject: [petsc-users] Weird behaviour of PCGAMG in coupled poroelasticity Message-ID: Hello, I'm trying to solve Biot's poroelasticity (Cryer's sphere problem) through a fully coupled scheme. Thus, the solution of a single linear system yields both displacement and pressure fields, |K L | | u | = |b_u|. |Q (A + H) | | p | = |b_p| The linear system is asymmetric, given that the discrete equations were obtained through the Element based Finite Volume Method (EbFVM). An unstructured tetrahedral grid is utilised, it has about 10000 nodal points (not coarse, nor too refined). Therefore, GMRES and GAMG are employed to solve it. Furthermore, the program was parallelised through a Domain Decomposition Method. Thus, each processor works in its subdomain only. So far, so good. For a given set of poroelastic properties (which are constant throughout time and space), the speedup increases as more processors are utilised: coupling intensity: 7.51e-01 proc solve time [s] 1 314.23 2 171.65 3 143.21 4 149.26 (> 143.21, but ok) However, after making the problem MORE coupled (different poroelastic properties), a strange behavior is observed: coupling intensity: 2.29e+01 proc solve time [s] 1 28909.35 2 192.39 3 181.29 4 14463.63 Recalling that GMRES and GAMG are used, KSP takes about 4300 iterations to converge when 1 processor is employed. On the other hand, for 2 processors, KSP takes around 30 iterations to reach convergence. Hence, explaining the difference between the solution times. Increasing the coupling even MORE, everything goes as expected: coupling intensity: 4.63e+01 proc solve time [s] 1 229.26 2 146.04 3 121.49 4 107.80 Because of this, I ask: * What may be the source of this behavior? Can it be predicted? * How can I remedy this situation? At last, are there better solver-pc choices for coupled poroelasticity? Thank you, Felipe -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sat Nov 30 00:48:59 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 30 Nov 2019 06:48:59 +0000 Subject: [petsc-users] Weird behaviour of PCGAMG in coupled poroelasticity In-Reply-To: References: Message-ID: I would first run with -ksp_monitor_true_residual -ksp_converged_reason to make sure that those "very fast" cases are actually converging in those runs also use -ksp_view to see what the GMAG parameters are. Also use the -info option to have it print details on the solution process. Barry > On Nov 29, 2019, at 4:14 PM, Felipe Giacomelli wrote: > > Hello, > > I'm trying to solve Biot's poroelasticity (Cryer's sphere problem) through a fully coupled scheme. Thus, the solution of a single linear system yields both displacement and pressure fields, > > |K L | | u | = |b_u|. > |Q (A + H) | | p | = |b_p| > > The linear system is asymmetric, given that the discrete equations were obtained through the Element based Finite Volume Method (EbFVM). An unstructured tetrahedral grid is utilised, it has about 10000 nodal points (not coarse, nor too refined). Therefore, GMRES and GAMG are employed to solve it. > > Furthermore, the program was parallelised through a Domain Decomposition Method. Thus, each processor works in its subdomain only. > > So far, so good. For a given set of poroelastic properties (which are constant throughout time and space), the speedup increases as more processors are utilised: > > coupling intensity: 7.51e-01 > > proc solve time [s] > 1 314.23 > 2 171.65 > 3 143.21 > 4 149.26 (> 143.21, but ok) > > However, after making the problem MORE coupled (different poroelastic properties), a strange behavior is observed: > > coupling intensity: 2.29e+01 > > proc solve time [s] > 1 28909.35 > 2 192.39 > 3 181.29 > 4 14463.63 > > Recalling that GMRES and GAMG are used, KSP takes about 4300 iterations to converge when 1 processor is employed. On the other hand, for 2 processors, KSP takes around 30 iterations to reach convergence. Hence, explaining the difference between the solution times. > > Increasing the coupling even MORE, everything goes as expected: > > coupling intensity: 4.63e+01 > > proc solve time [s] > 1 229.26 > 2 146.04 > 3 121.49 > 4 107.80 > > Because of this, I ask: > > * What may be the source of this behavior? Can it be predicted? > * How can I remedy this situation? > > At last, are there better solver-pc choices for coupled poroelasticity? > > Thank you, > Felipe From mfadams at lbl.gov Sat Nov 30 04:01:05 2019 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 30 Nov 2019 05:01:05 -0500 Subject: [petsc-users] Weird behaviour of PCGAMG in coupled poroelasticity In-Reply-To: References: Message-ID: Let me add that generic AMG is not great for systems like this (indefinite, asymmetric) so yes, check that your good cases are really good. GAMG uses eigenvalues, which are problematic for indefinite and asymmetric matrices. I don't know why this is ever working well, but try '-pc_type hypre' (and configure with --download-hypre'). Hypre is better with asymmetric matrices. This would provide useful information to diagnose what is going on here if not solve your problem. Note, the algorithms and implementations of hypre and GAMG are not very domain decomposition dependant so it is surprising to see these huge differences from the number of processors used. On Sat, Nov 30, 2019 at 1:49 AM Smith, Barry F. 
wrote: > > I would first run with -ksp_monitor_true_residual -ksp_converged_reason > to make sure that those "very fast" cases are actually converging in those > runs also use -ksp_view to see what the GMAG parameters are. Also use the > -info option to have it print details on the solution process. > > Barry > > > > > On Nov 29, 2019, at 4:14 PM, Felipe Giacomelli > wrote: > > > > Hello, > > > > I'm trying to solve Biot's poroelasticity (Cryer's sphere problem) > through a fully coupled scheme. Thus, the solution of a single linear > system yields both displacement and pressure fields, > > > > |K L | | u | = |b_u|. > > |Q (A + H) | | p | = |b_p| > > > > The linear system is asymmetric, given that the discrete equations were > obtained through the Element based Finite Volume Method (EbFVM). An > unstructured tetrahedral grid is utilised, it has about 10000 nodal points > (not coarse, nor too refined). Therefore, GMRES and GAMG are employed to > solve it. > > > > Furthermore, the program was parallelised through a Domain Decomposition > Method. Thus, each processor works in its subdomain only. > > > > So far, so good. For a given set of poroelastic properties (which are > constant throughout time and space), the speedup increases as more > processors are utilised: > > > > coupling intensity: 7.51e-01 > > > > proc solve time [s] > > 1 314.23 > > 2 171.65 > > 3 143.21 > > 4 149.26 (> 143.21, but ok) > > > > However, after making the problem MORE coupled (different poroelastic > properties), a strange behavior is observed: > > > > coupling intensity: 2.29e+01 > > > > proc solve time [s] > > 1 28909.35 > > 2 192.39 > > 3 181.29 > > 4 14463.63 > > > > Recalling that GMRES and GAMG are used, KSP takes about 4300 iterations > to converge when 1 processor is employed. On the other hand, for 2 > processors, KSP takes around 30 iterations to reach convergence. Hence, > explaining the difference between the solution times. > > > > Increasing the coupling even MORE, everything goes as expected: > > > > coupling intensity: 4.63e+01 > > > > proc solve time [s] > > 1 229.26 > > 2 146.04 > > 3 121.49 > > 4 107.80 > > > > Because of this, I ask: > > > > * What may be the source of this behavior? Can it be predicted? > > * How can I remedy this situation? > > > > At last, are there better solver-pc choices for coupled poroelasticity? > > > > Thank you, > > Felipe > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pranayreddy865 at gmail.com Sat Nov 30 14:30:51 2019 From: pranayreddy865 at gmail.com (baikadi pranay) Date: Sat, 30 Nov 2019 13:30:51 -0700 Subject: [petsc-users] Outputting matrix for viewing in matlab In-Reply-To: <40760746-8753-4B63-9F16-31B55C5A9005@anl.gov> References: <40760746-8753-4B63-9F16-31B55C5A9005@anl.gov> Message-ID: Thank you all for the suggestions. I have it working now. Regards, Pranay. On Fri, Nov 29, 2019 at 10:39 AM Smith, Barry F. wrote: > > > > On Nov 28, 2019, at 7:07 PM, baikadi pranay > wrote: > > > > Hello PETSc users, > > > > I have a sparse matrix built and I want to output the matrix for viewing > in matlab. However i'm having difficulty outputting the matrix. I am > writing my program in Fortran90 and I've included the following lines to > output the matrix. > > > > call > PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr) > > Normally on would use to save the matrix and then use the scripts > Patrick mentioned to read the matrix into Matlab or Python. 
> > call MatView(matrix, view,ierr) > call PetscViewerDestroy(view,ierr) > > > > > call PetscViewerBinaryGetDescriptor(view,fd,ierr) > > call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr) > > > > These lines do create a matrix but matlab says its not a binary file. > Could you please provide me some inputs on where I'm going wrong and how to > proceed with this problem. I can provide any further information that you > might need to help me solve this problem. > > > > > > Thank you. > > > > Sincerely, > > Pranay. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pranayreddy865 at gmail.com Sat Nov 30 15:18:12 2019 From: pranayreddy865 at gmail.com (baikadi pranay) Date: Sat, 30 Nov 2019 14:18:12 -0700 Subject: [petsc-users] Floating point exception Message-ID: Hello PETSc users, I am currently trying to build a 1-D Schrodinger solver. I have built my hamiltonian matrix (of size 121 x 121) and i'm trying to find the eigenvalues. I have the following lines of code for the solver: *call EPSCreate(PETSC_COMM_WORLD,eps,ierr)* *call EPSSetOperators(eps,ham,S,ierr)call EPSSetProblemType(eps,EPS_GHEP,ierr)* *call EPSSetFromOptions(eps,ierr)call EPSSetDimensions(eps,10,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_INTEGER,ierr)call EPSSolve(eps,ierr)call EPSDestroy(eps,ierr)* At the EPSSolve line, i get the following error: *[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------[0]PETSC ERROR: Floating point exception[0]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at end of function: Parameter number 3[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.[0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019* I am using the options *-st_pc_factor_shift_type NONZERO -st_pc_factor_shift_amount 1* ( else I end up getting the "zero pivot in LU factorization" error ). I outputted my matrix to matlab and confirmed that the null space is empty and the matrix is not singular. I am not sure why I'm getting this error. Could you provide me a hint as to how to solve this problem. Sincerely, Pranay. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Nov 30 15:45:55 2019 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Nov 2019 15:45:55 -0600 Subject: [petsc-users] Floating point exception In-Reply-To: References: Message-ID: On Sat, Nov 30, 2019 at 3:19 PM baikadi pranay wrote: > Hello PETSc users, > > I am currently trying to build a 1-D Schrodinger solver. I have built my > hamiltonian matrix (of size 121 x 121) and i'm trying to find the > eigenvalues. 
I have the following lines of code for the solver: > > *call EPSCreate(PETSC_COMM_WORLD,eps,ierr)* > > *call EPSSetOperators(eps,ham,S,ierr)call > EPSSetProblemType(eps,EPS_GHEP,ierr)* > > > > *call EPSSetFromOptions(eps,ierr)call > EPSSetDimensions(eps,10,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_INTEGER,ierr)call > EPSSolve(eps,ierr)call EPSDestroy(eps,ierr)* > > At the EPSSolve line, i get the following error: > > > > > > *[0]PETSC ERROR: --------------------- Error Message > --------------------------------------------------------------[0]PETSC > ERROR: Floating point exception[0]PETSC ERROR: Vec entry at local location > 0 is not-a-number or infinite at end of function: Parameter number > 3[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble > shooting.[0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019* > You need to show the entire stack trace that is output here. Thanks, Matt > I am using the options *-st_pc_factor_shift_type NONZERO > -st_pc_factor_shift_amount 1* ( else I end up getting the "zero pivot > in LU factorization" error ). > > I outputted my matrix to matlab and confirmed that the null space is empty > and the matrix is not singular. I am not sure why I'm getting this error. > Could you provide me a hint as to how to solve this problem. > > Sincerely, > Pranay. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pranayreddy865 at gmail.com Sat Nov 30 15:56:15 2019 From: pranayreddy865 at gmail.com (baikadi pranay) Date: Sat, 30 Nov 2019 14:56:15 -0700 Subject: [petsc-users] Floating point exception In-Reply-To: References: Message-ID: Hello, The entire output is the following: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Floating point exception [0]PETSC ERROR: Vec entry at local location 0 is not-a-number or infinite at end of function: Parameter number 3 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019 [0]PETSC ERROR: ./a.out on a linux-gnu-c-debug named agave1.agave.rc.asu.edu by pbaikadi Sat Nov 30 14:54:31 2019 [0]PETSC ERROR: Configure options [0]PETSC ERROR: #1 VecValidValues() line 28 in /packages/7x/petsc/3.11.1/petsc-3.11.1/src/vec/vec/interface/rvector.c [0]PETSC ERROR: #2 PCApply() line 464 in /packages/7x/petsc/3.11.1/petsc-3.11.1/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #3 KSP_PCApply() line 281 in /packages/7x/petsc/3.11.1/petsc-3.11.1/include/petsc/private/kspimpl.h [0]PETSC ERROR: #4 KSPSolve_PREONLY() line 22 in /packages/7x/petsc/3.11.1/petsc-3.11.1/src/ksp/ksp/impls/preonly/preonly.c [0]PETSC ERROR: #5 KSPSolve() line 782 in /packages/7x/petsc/3.11.1/petsc-3.11.1/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #6 STMatSolve() line 193 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/sys/classes/st/interface/stsles.c [0]PETSC ERROR: #7 STApply_Shift() line 25 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/sys/classes/st/impls/shift/shift.c [0]PETSC ERROR: #8 STApply() line 57 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/sys/classes/st/interface/stsolve.c [0]PETSC ERROR: #9 EPSGetStartVector() line 797 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/eps/interface/epssolve.c [0]PETSC ERROR: #10 EPSSolve_KrylovSchur_Symm() line 32 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/eps/impls/krylov/krylovschur/ks-symm.c [0]PETSC ERROR: #11 EPSSolve() line 149 in /packages/7x/slepc/3.11.1/slepc-3.11.1/src/eps/interface/epssolve.c Regards, Pranay. On Sat, Nov 30, 2019 at 2:46 PM Matthew Knepley wrote: > On Sat, Nov 30, 2019 at 3:19 PM baikadi pranay > wrote: > >> Hello PETSc users, >> >> I am currently trying to build a 1-D Schrodinger solver. I have built my >> hamiltonian matrix (of size 121 x 121) and i'm trying to find the >> eigenvalues. I have the following lines of code for the solver: >> >> *call EPSCreate(PETSC_COMM_WORLD,eps,ierr)* >> >> *call EPSSetOperators(eps,ham,S,ierr)call >> EPSSetProblemType(eps,EPS_GHEP,ierr)* >> >> >> >> *call EPSSetFromOptions(eps,ierr)call >> EPSSetDimensions(eps,10,PETSC_DEFAULT_INTEGER,PETSC_DEFAULT_INTEGER,ierr)call >> EPSSolve(eps,ierr)call EPSDestroy(eps,ierr)* >> >> At the EPSSolve line, i get the following error: >> >> >> >> >> >> *[0]PETSC ERROR: --------------------- Error Message >> --------------------------------------------------------------[0]PETSC >> ERROR: Floating point exception[0]PETSC ERROR: Vec entry at local location >> 0 is not-a-number or infinite at end of function: Parameter number >> 3[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble >> shooting.[0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019* >> > > You need to show the entire stack trace that is output here. > > Thanks, > > Matt > > >> I am using the options *-st_pc_factor_shift_type NONZERO >> -st_pc_factor_shift_amount 1* ( else I end up getting the "zero >> pivot in LU factorization" error ). >> >> I outputted my matrix to matlab and confirmed that the null space is >> empty and the matrix is not singular. I am not sure why I'm getting this >> error. Could you provide me a hint as to how to solve this problem. >> >> Sincerely, >> Pranay. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pranayreddy865 at gmail.com Sat Nov 30 19:28:05 2019 From: pranayreddy865 at gmail.com (baikadi pranay) Date: Sat, 30 Nov 2019 18:28:05 -0700 Subject: [petsc-users] petsc-users Digest, Vol 131, Issue 49 In-Reply-To: References: Message-ID: Hello all, I was able to figure out why i was getting a floating point error. Although i knew that petsc uses 0-based indexing for fortran, i forgot to modify my code accordingly. So it was accessing elements of the array which are essentially zero. Thank you for your time. Regards, Pranay. -------------- next part -------------- An HTML attachment was scrubbed... URL: