From jroman at dsic.upv.es Sat Aug 1 02:05:53 2015
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Sat, 1 Aug 2015 09:05:53 +0200
Subject: [petsc-users] SLEPc example failed ...
In-Reply-To: 
References: 
Message-ID: <42827D94-B36B-4268-9CE7-425F45B2CF34@dsic.upv.es>

There is not enough information to give an answer. Did you modify the example code? Did 'make test' work after the SLEPc installation? Use a debugger to see the exact point where the execution failed.
Jose

On 01/08/2015, at 00:29, Xujun Zhao wrote:

> Hi all,
>
> I ran the EPS example ex9 and it failed. Can anyone help me figure out the problem? Thanks. The following is the output error message:
>
> mcswl156:eps_tutorials xzhao$ ./ex9 -n 10
>
> Brusselator wave model, n=10
>
> ---> my test: VecCreateMPIWithArray is done.
> ---> my test: Shell Matrix is created.
> ---> my test: EPS is set.
> ---> my test: Start to solve the EPS ...
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [0]PETSC ERROR: ./ex9 on a arch-darwin-c-opt named mcswl156.mcs.anl.gov by xzhao Fri Jul 31 17:26:52 2015
> [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0
> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
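For reference, ex9 applies the Brusselator operator through a shell matrix, so a segmentation fault inside EPSSolve() typically means the shell matrix is missing a callback (most often MATOP_MULT) or its user context points at memory that has gone out of scope. The following is only a minimal sketch of the pieces that must be in place before EPSSolve() is called; MyCtx and MatMult_MyShell are hypothetical names, not the actual ex9 code, and error checking is abbreviated.

#include <slepceps.h>

typedef struct { PetscInt n; /* ... problem data ... */ } MyCtx;

/* y = A*x for the shell matrix; EPSSolve() reaches this through MatMult() */
static PetscErrorCode MatMult_MyShell(Mat A, Vec x, Vec y)
{
  MyCtx *ctx;
  PetscFunctionBeginUser;
  MatShellGetContext(A, &ctx);
  /* ... apply the operator to x, store the result in y, using ctx ... */
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  Mat      A;
  EPS      eps;
  MyCtx    ctx;
  PetscInt n = 10;

  SlepcInitialize(&argc, &argv, NULL, NULL);
  ctx.n = n;
  /* ctx must stay alive for as long as the shell matrix is used */
  MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, &ctx, &A);
  /* without this, EPSSolve() ends up calling an operation that was never set */
  MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MatMult_MyShell);

  EPSCreate(PETSC_COMM_WORLD, &eps);
  EPSSetOperators(eps, A, NULL);
  EPSSetProblemType(eps, EPS_NHEP);
  EPSSetFromOptions(eps);
  EPSSolve(eps);

  EPSDestroy(&eps);
  MatDestroy(&A);
  SlepcFinalize();
  return 0;
}

If a debugger, as suggested above, shows the crash inside MatMult or inside the user mult routine, the shell-matrix setup and its context are the first things to check.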
From fdkong.jd at gmail.com Sat Aug 1 16:01:49 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sat, 1 Aug 2015 15:01:49 -0600
Subject: [petsc-users] failed to compile HDF5 on vesta
Message-ID: 

Hi all,

I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.

Thanks,

Fande Kong,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.log
Type: application/octet-stream
Size: 2753937 bytes
Desc: not available

From bsmith at mcs.anl.gov Sat Aug 1 16:15:01 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 1 Aug 2015 16:15:01 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: 

Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.

Barry

> On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
>
> Hi all,
>
> I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
>
> Thanks,
>
> Fande Kong,

From fdkong.jd at gmail.com Sat Aug 1 19:55:08 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sat, 1 Aug 2015 18:55:08 -0600
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: 

Hi Barry,

Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.

Fande Kong,

On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
>
> Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
>
> Barry
>
> > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> >
> > Hi all,
> >
> > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> >
> > Thanks,
> >
> > Fande Kong,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.log
Type: application/octet-stream
Size: 5576806 bytes
Desc: not available

From bsmith at mcs.anl.gov Sun Aug 2 11:30:50 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 2 Aug 2015 11:30:50 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>

You shouldn't need --download-fblaslapack; almost every system has it already installed.

Barry

Looks like the Fortran compiler on this system is rejecting the "old" Fortran in the blas/lapack code.

> On Aug 1, 2015, at 7:55 PM, Fande Kong wrote:
>
> Hi Barry,
>
> Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.
>
> Fande Kong,
>
> On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
>
> Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
>
> Barry
>
> > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> >
> > Hi all,
> >
> > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> >
> > Thanks,
> >
> > Fande Kong,

From fdkong.jd at gmail.com Sun Aug 2 22:22:34 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sun, 2 Aug 2015 22:22:34 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>
References: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>
Message-ID: 

Hi Barry,

Looks like they did not have fblaslapack installed. I could compile fblaslapack when I switched the compiler from XL to gcc.

Thanks,

Fande Kong,

On Sun, Aug 2, 2015 at 11:30 AM, Barry Smith wrote:
>
> You shouldn't need --download-fblaslapack; almost every system has it already installed.
>
> Barry
>
> Looks like the Fortran compiler on this system is rejecting the "old" Fortran in the blas/lapack code.
>
> > On Aug 1, 2015, at 7:55 PM, Fande Kong wrote:
> >
> > Hi Barry,
> >
> > Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.
> >
> > Fande Kong,
> >
> > On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
> >
> > Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
> >
> > Barry
> >
> > > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> > >
> > > Hi all,
> > >
> > > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> > >
> > > Thanks,
> > >
> > > Fande Kong,

From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 07:02:11 2015
From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se)
Date: Mon, 3 Aug 2015 12:02:11 +0000
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

Hong and Sherry,

I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:

If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c
If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: den 30 juli 2015 02:58
To: Ülker-Kaustell, Mahir
Cc: Xiaoye Li; PETSc users list
Subject: Fwd: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry fixed several bugs in superlu_dist-v4.1.
The current petsc-release interfaces with superlu_dist-v4.0.
We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?

Here is how to do it:
1. download superlu_dist v4.1
2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
3. build petsc

Let us know if the issue remains.

Hong

---------- Forwarded message ----------
From: Xiaoye S. Li
Date: Wed, Jul 29, 2015 at 2:24 PM
Subject: Fwd: [petsc-users] SuperLU MPI-problem
To: Hong Zhang

Hong,

I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers me is that he is getting the following error:

Invalid ISPEC at line 484 in file get_perm_c.c

This has nothing to do with my bug fix.

Shall we ask him to try the new version, or try to get his matrix?

Sherry

---------- Forwarded message ----------
From: Mahir.Ulker-Kaustell at tyrens.se
Date: Wed, Jul 22, 2015 at 1:32 PM
Subject: RE: [petsc-users] SuperLU MPI-problem
To: Hong, "Xiaoye S. Li"
Cc: petsc-users

The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general, but certain constraints lead to non-diagonal streaks in the sparsity pattern.

Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm?
If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements, so they are sparse.
> The machine I am working on has 128GB of RAM. I have estimated the memory needed at less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here?
>
> Mahir
>
> From: Hong [mailto:hzhang at mcs.anl.gov]
> Sent: den 20 juli 2015 17:39
> To: Ülker-Kaustell, Mahir
> Cc: petsc-users
> Subject: Re: [petsc-users] SuperLU MPI-problem
>
> Mahir:
> Direct solvers consume a large amount of memory. I suggest trying the following:
>
> 1. A sparse iterative solver, if [-omega^2 M + K] is not too ill-conditioned. You may test it using the small matrix.
>
> 2. Incrementally increase your matrix sizes. Try different matrix orderings.
> Do you get the memory crash in the 1st symbolic factorization?
> In your case, the matrix data structure stays the same when omega changes, so you only need to do one matrix symbolic factorization and reuse it.
>
> 3. Use a machine that gives larger memory.
>
> Hong
>
> Dear Petsc-Users,
>
> I am trying to use PETSc to solve a set of linear equations arising from Navier's equation (elastodynamics) in the frequency domain.
> The frequency dependency of the problem requires that the system
>
> [-omega^2 M + K]u = F
>
> where M and K are constant, square, positive definite matrices (mass and stiffness respectively), is solved for each frequency omega of interest.
> K is a complex matrix, including material damping.
>
> I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full-scale (on the order of 10^6 degrees of freedom) problem.
>
> The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory.
>
> I would guess that similar problems have occurred on this mailing list, so I am hoping that someone can push me in the right direction?
>
> Mahir

From nicolas.pozin at inria.fr Mon Aug 3 09:13:08 2015
From: nicolas.pozin at inria.fr (Nicolas Pozin)
Date: Mon, 3 Aug 2015 16:13:08 +0200 (CEST)
Subject: [petsc-users] problem with MatShellGetContext
In-Reply-To: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr>
Message-ID: <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr>

Hello everyone,

I am having trouble using MatShellGetContext. Here's the simple test I did:

typedef struct{
  PetscInt testValue;
  Mat matShell;
  KSP currentCtx;
} AppCtx;

AppCtx context1;
KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx);
context1.testValue=18;
MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, &context1.matShell);

AppCtx context2;
MatShellGetContext(context1.matShell, (void*)&context2);

It happens that context2.testValue is different from 18.
Would anyone have a clue about what I am missing?

thanks a lot,
Nicolas
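MatShellGetContext() returns the pointer that was handed to MatCreateShell(); it does not copy anything into a caller-supplied struct. In the snippet above, the context stored in the shell matrix is the KSP (context1.currentCtx), so reading an AppCtx back out of it cannot give testValue = 18. A minimal sketch of the usual pattern, assuming PETSc 3.5/3.6-era signatures and with error checking omitted, is:

#include <petscksp.h>

typedef struct {
  PetscInt testValue;
  Mat      matShell;
  KSP      currentCtx;
} AppCtx;

int main(int argc, char **argv)
{
  AppCtx    context1;
  AppCtx   *retrieved = NULL;
  PetscInt  nL = 10;            /* small global size, just for the test */

  PetscInitialize(&argc, &argv, NULL, NULL);
  KSPCreate(PETSC_COMM_WORLD, &context1.currentCtx);
  context1.testValue = 18;

  /* store the ADDRESS of the whole struct as the shell context */
  MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nL, nL,
                 &context1, &context1.matShell);

  /* MatShellGetContext hands back that same address; nothing is copied */
  MatShellGetContext(context1.matShell, &retrieved);
  PetscPrintf(PETSC_COMM_WORLD, "testValue = %d\n", (int)retrieved->testValue); /* prints 18 */

  MatDestroy(&context1.matShell);
  KSPDestroy(&context1.currentCtx);
  PetscFinalize();
  return 0;
}

If the shell really is meant to carry the KSP as its context, as in the original snippet, then the retrieval has to match: declare a KSP and call MatShellGetContext(context1.matShell, &ksp). The type read out must correspond to the pointer that was put in.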
From xsli at lbl.gov Mon Aug 3 09:17:54 2015
From: xsli at lbl.gov (Xiaoye S. Li)
Date: Mon, 3 Aug 2015 07:17:54 -0700
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since the matrix is centralized).

That's why you get the following error:
Invalid ISPEC at line 484 in file get_perm_c.c

You need to use the distributed matrix input interface pzgssvx() (without ABglobal).

Sherry

On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se <Mahir.Ulker-Kaustell at tyrens.se> wrote:

> Hong and Sherry,
>
> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:
>
> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c
> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c
>
> Mahir
>
> From: Hong [mailto:hzhang at mcs.anl.gov]
> Sent: den 30 juli 2015 02:58
> To: Ülker-Kaustell, Mahir
> Cc: Xiaoye Li; PETSc users list
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
>
> Mahir,
>
> Sherry fixed several bugs in superlu_dist-v4.1.
> The current petsc-release interfaces with superlu_dist-v4.0.
> We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?
>
> Here is how to do it:
> 1. download superlu_dist v4.1
> 2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
> 3. build petsc
>
> Let us know if the issue remains.
>
> Hong
>
> ---------- Forwarded message ----------
> From: Xiaoye S. Li
> Date: Wed, Jul 29, 2015 at 2:24 PM
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
> To: Hong Zhang
>
> Hong,
>
> I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers me is that he is getting the following error:
>
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> This has nothing to do with my bug fix.
>
> Shall we ask him to try the new version, or try to get his matrix?
>
> Sherry
>
> ---------- Forwarded message ----------
> From: Mahir.Ulker-Kaustell at tyrens.se <Mahir.Ulker-Kaustell at tyrens.se>
> Date: Wed, Jul 22, 2015 at 1:32 PM
> Subject: RE: [petsc-users] SuperLU MPI-problem
> To: Hong, "Xiaoye S. Li"
> Cc: petsc-users
>
> The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general, but certain constraints lead to non-diagonal streaks in the sparsity pattern.
>
> Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm?
>
> If I use -mat_superlu_dist_parsymbfact the program crashes with
>
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. 
> > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > 
==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: 
MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 
0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: 
MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: 
superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. 
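A minimal sketch of the reuse Hong suggests above, assuming the names K, M, F and u for the already assembled stiffness matrix, mass matrix, load vector and solution vector, and assuming the nonzero pattern of M is contained in that of K (usually the case when both come from the same finite element mesh); the frequency sweep values are placeholders:

   Mat            A;              /* A = K - omega^2*M, pattern fixed      */
   KSP            ksp;
   PC             pc;
   PetscErrorCode ierr;
   PetscReal      omega, omega0 = 1.0, domega = 1.0;
   PetscInt       i, nfreq = 100;

   ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);

   ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
   ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
   ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
   ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

   for (i = 0; i < nfreq; i++) {
     omega = omega0 + i*domega;
     /* Rebuild A = K - omega^2*M without touching its nonzero pattern */
     ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
     ierr = MatAXPY(A, -omega*omega, M, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr);
     /* The values of A changed but the pattern did not, so PETSc should
        only redo the numerical factorization here; the ordering and
        symbolic factorization from the first solve are reused. */
     ierr = KSPSolve(ksp, F, u);CHKERRQ(ierr);
     /* ... post-process u for this omega ... */
   }

The solver can still be selected at runtime with the options already used in this thread, e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist.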
> > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 3 09:33:16 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Aug 2015 09:33:16 -0500 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> Message-ID: On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin wrote: > Hello everyone, > > I am having trouble using MatShellGetContext. > > Here's the simple test I did : > > > typedef struct{ > PetscInt testValue; > Mat matShell; > KSP currentCtx; > } AppCtx; > > > AppCtx context1; > KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx); > context1.testValue=18; > MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, > &context1.matShell); > It looks like you want ''&context1" for the context argument. You are just passing the KSP pointer. > AppCtx context2; > MatShellGetContext(context1.matShell, (void*)&context2); > Here you better declare AppCtx *context2; and access it as context2->testValue; Matt > It happens that context2.testValue is different from 18. > > Any would have a clue on what I miss? > > thanks a lot, > Nicolas > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Mon Aug 3 09:42:00 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Mon, 3 Aug 2015 16:42:00 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> Message-ID: <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr> ----- Mail original ----- > De: "Matthew Knepley" > ?: "Nicolas Pozin" > Cc: "PETSc" > Envoy?: Lundi 3 Ao?t 2015 16:33:16 > Objet: Re: [petsc-users] problem with MatShellGetContext > On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin < nicolas.pozin at inria.fr > > wrote: > > Hello everyone, > > > I am having trouble using MatShellGetContext. > > > Here's the simple test I did : > > > typedef struct{ > > > PetscInt testValue; > > > Mat matShell; > > > KSP currentCtx; > > > } AppCtx; > > > AppCtx context1; > > > KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx); > > > context1.testValue=18; > > > MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, > > &context1.matShell); > > It looks like you want ''&context1" for the context argument. You are just > passing the KSP pointer. > > AppCtx context2; > > > MatShellGetContext(context1.matShell, (void*)&context2); > > Here you better declare > AppCtx *context2; > and access it as > context2->testValue; Thanks, but It doesn't work better unfortunately > Matt > > It happens that context2.testValue is different from 18. > > > Any would have a clue on what I miss? > > > thanks a lot, > > > Nicolas > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
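For completeness, a small self-contained version of the pattern Matt describes, keeping the struct and variable names from the snippet above (the global size nL and the PetscPrintf at the end are just there to make it runnable):

   #include <petscksp.h>

   typedef struct {
     PetscInt testValue;
     Mat      matShell;
     KSP      currentCtx;
   } AppCtx;

   int main(int argc, char **argv)
   {
     AppCtx         context1, *context2 = NULL;
     PetscErrorCode ierr;
     PetscInt       nL = 10;   /* placeholder global size */

     ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);

     context1.testValue = 18;
     ierr = KSPCreate(PETSC_COMM_WORLD, &context1.currentCtx);CHKERRQ(ierr);

     /* Pass the address of the whole struct as the shell context,
        not one of its members. */
     ierr = MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                           nL, nL, (void*)&context1,
                           &context1.matShell);CHKERRQ(ierr);

     /* MatShellGetContext hands back the stored pointer, so the receiving
        variable is a pointer to AppCtx, not a copy of the struct. */
     ierr = MatShellGetContext(context1.matShell,
                               (void*)&context2);CHKERRQ(ierr);
     ierr = PetscPrintf(PETSC_COMM_WORLD, "testValue = %D\n",
                        context2->testValue);CHKERRQ(ierr);

     ierr = MatDestroy(&context1.matShell);CHKERRQ(ierr);
     ierr = KSPDestroy(&context1.currentCtx);CHKERRQ(ierr);
     ierr = PetscFinalize();
     return 0;
   }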
> -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Aug 3 09:46:04 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 3 Aug 2015 09:46:04 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > >> Hong and Sherry, >> >> >> >> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: >> >> >> >> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid >> ISPEC at line 484 in file get_perm_c.c >> >> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the >> program crashes with: Calloc fails for SPA dense[]. at line 438 in file >> zdistribute.c >> >> >> >> Mahir >> >> >> >> *From:* Hong [mailto:hzhang at mcs.anl.gov] >> *Sent:* den 30 juli 2015 02:58 >> *To:* ?lker-Kaustell, Mahir >> *Cc:* Xiaoye Li; PETSc users list >> >> *Subject:* Fwd: [petsc-users] SuperLU MPI-problem >> >> >> >> Mahir, >> >> >> >> Sherry fixed several bugs in superlu_dist-v4.1. >> >> The current petsc-release interfaces with superlu_dist-v4.0. >> >> We do not know whether the reported issue (attached below) has been >> resolved or not. If not, can you test it with the latest superlu_dist-v4.1? >> >> >> >> Here is how to do it: >> >> 1. download superlu_dist v4.1 >> >> 2. remove existing PETSC_ARCH directory, then configure petsc with >> >> '--download-superlu_dist=superlu_dist_4.1.tar.gz' >> >> 3. build petsc >> >> >> >> Let us know if the issue remains. >> >> >> >> Hong >> >> >> >> >> >> ---------- Forwarded message ---------- >> From: *Xiaoye S. Li* >> Date: Wed, Jul 29, 2015 at 2:24 PM >> Subject: Fwd: [petsc-users] SuperLU MPI-problem >> To: Hong Zhang >> >> Hong, >> >> I am cleaning the mailbox, and saw this unresolved issue. 
I am not sure >> whether the new fix to parallel symbolic factorization solves the problem. >> What bothers be is that he is getting the following error: >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> This has nothing to do with my bug fix. >> >> ? Shall we ask him to try the new version, or try to get him matrix? >> >> Sherry >> ? >> >> >> >> ---------- Forwarded message ---------- >> From: *Mahir.Ulker-Kaustell at tyrens.se * < >> Mahir.Ulker-Kaustell at tyrens.se> >> Date: Wed, Jul 22, 2015 at 1:32 PM >> Subject: RE: [petsc-users] SuperLU MPI-problem >> To: Hong , "Xiaoye S. Li" >> Cc: petsc-users >> >> The 1000 was just a conservative guess. The number of non-zeros per row >> is in the tens in general but certain constraints lead to non-diagonal >> streaks in the sparsity-pattern. >> >> Is it the reordering of the matrix that is killing me here? How can I set >> options.ColPerm? >> >> >> >> If i use -mat_superlu_dist_parsymbfact the program crashes with >> >> >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> ------------------------------------------------------- >> >> Primary job terminated normally, but 1 process returned >> >> a non-zero exit code.. Per user-direction, the job has been aborted. >> >> ------------------------------------------------------- >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >> batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 >> >> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by >> muk Wed Jul 22 21:59:23 2015 >> >> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 >> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 >> --with-scalar-type=complex --download-fblaspack --download-mpich >> --download-scalapack --download-mumps --download-metis --download-parmetis >> --download-superlu --download-superlu_dist --download-fftw >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [unset]: aborting job: >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat >> later) with >> >> >> >> Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c >> >> col block 3006 ------------------------------------------------------- >> >> Primary job terminated normally, but 1 process returned >> >> a non-zero exit code.. Per user-direction, the job has been aborted. 
>> >> ------------------------------------------------------- >> >> col block 1924 [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >> batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 >> >> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by >> muk Wed Jul 22 21:59:58 2015 >> >> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 >> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 >> --with-scalar-type=complex --download-fblaspack --download-mpich >> --download-scalapack --download-mumps --download-metis --download-parmetis >> --download-superlu --download-superlu_dist --download-fftw >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [unset]: aborting job: >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> >> >> /Mahir >> >> >> >> >> >> *From:* Hong [mailto:hzhang at mcs.anl.gov] >> >> *Sent:* den 22 juli 2015 21:34 >> *To:* Xiaoye S. Li >> *Cc:* ?lker-Kaustell, Mahir; petsc-users >> >> >> *Subject:* Re: [petsc-users] SuperLU MPI-problem >> >> >> >> In Petsc/superlu_dist interface, we set default >> >> >> >> options.ParSymbFact = NO; >> >> >> >> When user raises the flag "-mat_superlu_dist_parsymbfact", >> >> we set >> >> >> >> options.ParSymbFact = YES; >> >> options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for >> ParSymbFact regardless of user ordering setting */ >> >> >> >> We do not change anything else. >> >> >> >> Hong >> >> >> >> On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: >> >> I am trying to understand your problem. You said you are solving Naviers >> equation (elastodynamics) in the frequency domain, using finite element >> discretization. I wonder why you have about 1000 nonzeros per row. >> Usually in many PDE discretized matrices, the number of nonzeros per row is >> in the tens (even for 3D problems), not in the thousands. So, your matrix >> is quite a bit denser than many sparse matrices we deal with. >> >> >> >> The number of nonzeros in the L and U factors is much more than that in >> original matrix A -- typically we see 10-20x fill ratio for 2D, or can be >> as bad as 50-100x fill ratio for 3D. But since your matrix starts much >> denser (i.e., the underlying graph has many connections), it may not lend >> to any good ordering strategy to preserve sparsity of L and U; that is, the >> L and U fill ratio may be large. 
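To put these fill ratios against the figures earlier in the thread (a rough, back-of-the-envelope estimate only, assuming on the order of 50 nonzeros per row and complex double precision, i.e. 16 bytes per stored value): a matrix with 10^6 rows and ~50 nonzeros per row has about 5x10^7 entries, roughly 1 GB including column indices. A 3D-like fill ratio of 50x puts L+U at about 2.5x10^9 entries, i.e. around 40 GB for the numerical values alone, before any integer structure or MPI buffers are counted. So an estimate of 20 GB based on the size of A can be optimistic by a factor of two or more, even on a 128 GB machine.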
>> >> >> >> I don't understand why you get the following error when you use >> >> ?-mat_superlu_dist_parsymbfact?. >> >> >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> >> >> Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. >> >> >> >> ?Hong -- in order to use parallel symbolic factorization, is it >> sufficient to specify only >> >> ?-mat_superlu_dist_parsymbfact? >> >> ? ? (the default is to use sequential symbolic factorization.) >> >> >> >> >> >> Sherry >> >> >> >> On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < >> Mahir.Ulker-Kaustell at tyrens.se> wrote: >> >> Thank you for your reply. >> >> As you have probably figured out already, I am not a computational >> scientist. I am a researcher in civil engineering (railways for high-speed >> traffic), trying to produce some, from my perspective, fairly large >> parametric studies based on finite element discretizations. >> >> I am working in a Windows-environment and have installed PETSc through >> Cygwin. >> Apparently, there is no support for Valgrind in this OS. >> >> If I have understood you correct, the memory issues are related to >> superLU and given my background, there is not much I can do. Is this >> correct? >> >> >> Best regards, >> Mahir >> >> ______________________________________________ >> Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, >> Tyr?ns AB >> 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se >> ______________________________________________ >> >> >> -----Original Message----- >> From: Barry Smith [mailto:bsmith at mcs.anl.gov] >> Sent: den 22 juli 2015 02:57 >> To: ?lker-Kaustell, Mahir >> Cc: Xiaoye S. Li; petsc-users >> Subject: Re: [petsc-users] SuperLU MPI-problem >> >> >> Run the program under valgrind >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I >> use the option -mat_superlu_dist_parsymbfact I get many scary memory >> problems some involving for example ddist_psymbtonum >> (pdsymbfact_distdata.c:1332) >> >> Note that I consider it unacceptable for running programs to EVER use >> uninitialized values; until these are all cleaned up I won't trust any runs >> like this. 
>> >> Barry >> >> >> >> >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) >> ==42050== by 0x101557F60: get_perm_c_parmetis >> (get_perm_c_parmetis.c:285) >> ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10155751B: get_perm_c_parmetis >> (get_perm_c_parmetis.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) >> ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) >> ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) >> ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) >> ==42050== by 0x101557F60: get_perm_c_parmetis >> (get_perm_c_parmetis.c:285) >> ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10155751B: get_perm_c_parmetis >> (get_perm_c_parmetis.c:96) >> ==42050== >> ==42049== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42049== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42049== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) >> ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) >> ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) >> ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42049== by 
0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size >> 752,720 alloc'd >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) >> ==42048== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) >> ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) >> ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42048== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) >> ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) >> ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) >> ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) >> ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== Address 0x10597a860 is 
1,408 bytes inside a block of size >> 752,720 alloc'd >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) >> ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) >> ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) >> ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) >> ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42048== Syscall param write(buf) points to uninitialised byte(s) >> ==42048== at 0x102DA1C22: write (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) >> ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) >> ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:257) >> ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) >> ==42048== by 0x10277A1FA: MPI_Send (send.c:127) >> ==42048== by 0x10155802F: get_perm_c_parmetis >> (get_perm_c_parmetis.c:299) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Address 0x104810704 is on thread 1's stack >> ==42048== in 
frame #3, created by MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:218) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42048== by 0x101557AB9: get_perm_c_parmetis >> (get_perm_c_parmetis.c:185) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) 
>> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) >> ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) >> ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) >> ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42050== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42050== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) >> ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size >> 131,072 alloc'd >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) >> ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a heap allocation >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) >> ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== >> ==42048== Conditional jump or move depends on uninitialised value(s) >> ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) >> ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> 
==42048== >> ==42049== Conditional jump or move depends on uninitialised value(s) >> ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) >> ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42048== Conditional jump or move depends on uninitialised value(s) >> ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) >> ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== Conditional jump or move depends on uninitialised value(s) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 
0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a heap allocation >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== >> >> >> > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: >> > >> > Ok. So I have been creating the full factorization on each process. >> That gives me some hope! >> > >> > I followed your suggestion and tried to use the runtime option >> ?-mat_superlu_dist_parsymbfact?. >> > However, now the program crashes with: >> > >> > Invalid ISPEC at line 484 in file get_perm_c.c >> > >> > And so on? >> > >> > From the SuperLU manual; I should give the option either YES or NO, >> however -mat_superlu_dist_parsymbfact YES makes the program crash in the >> same way as above. >> > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the >> PETSc documentation >> > >> > Mahir >> > >> > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, >> Tyr?ns AB >> > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se >> > >> > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] >> > Sent: den 20 juli 2015 18:12 >> > To: ?lker-Kaustell, Mahir >> > Cc: Hong; petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > The default SuperLU_DIST setting is to serial symbolic factorization. >> Therefore, what matters is how much memory do you have per MPI task? >> > >> > The code failed to malloc memory during redistribution of matrix A to >> {L\U} data struction (using result of serial symbolic factorization.) >> > >> > You can use parallel symbolic factorization, by runtime option: >> '-mat_superlu_dist_parsymbfact' >> > >> > Sherry Li >> > >> > >> > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < >> Mahir.Ulker-Kaustell at tyrens.se> wrote: >> > Hong: >> > >> > Previous experiences with this equation have shown that it is very >> difficult to solve it iteratively. Hence the use of a direct solver. >> > >> > The large test problem I am trying to solve has slightly less than 10^6 >> degrees of freedom. The matrices are derived from finite elements so they >> are sparse. >> > The machine I am working on has 128GB ram. I have estimated the memory >> needed to less than 20GB, so if the solver needs twice or even three times >> as much, it should still work well. Or have I completely misunderstood >> something here? >> > >> > Mahir >> > >> > >> > >> > From: Hong [mailto:hzhang at mcs.anl.gov] >> > Sent: den 20 juli 2015 17:39 >> > To: ?lker-Kaustell, Mahir >> > Cc: petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > Mahir: >> > Direct solvers consume large amount of memory. Suggest to try >> followings: >> > >> > 1. A sparse iterative solver if [-omega^2M + K] is not too >> ill-conditioned. You may test it using the small matrix. >> > >> > 2. Incrementally increase your matrix sizes. Try different matrix >> orderings. >> > Do you get memory crash in the 1st symbolic factorization? >> > In your case, matrix data structure stays same when omega changes, so >> you only need to do one matrix symbolic factorization and reuse it. >> > >> > 3. Use a machine that gives larger memory. >> > >> > Hong >> > >> > Dear Petsc-Users, >> > >> > I am trying to use PETSc to solve a set of linear equations arising >> from Naviers equation (elastodynamics) in the frequency domain. >> > The frequency dependency of the problem requires that the system >> > >> > [-omega^2M + K]u = F >> > >> > where M and K are constant, square, positive definite matrices (mass >> and stiffness respectively) is solved for each frequency omega of interest. >> > K is a complex matrix, including material damping. >> > >> > I have written a PETSc program which solves this problem for a small >> (1000 degrees of freedom) test problem on one or several processors, but it >> keeps crashing when I try it on my full scale (in the order of 10^6 degrees >> of freedom) problem. >> > >> > The program crashes at KSPSetUp() and from what I can see in the error >> messages, it appears as if it consumes too much memory. >> > >> > I would guess that similar problems have occurred in this mail-list, so >> I am hoping that someone can push me in the right direction? >> > >> > Mahir >> >> >> >> >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From knepley at gmail.com Mon Aug 3 09:47:13 2015
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 3 Aug 2015 09:47:13 -0500
Subject: [petsc-users] problem with MatShellGetContext
In-Reply-To: <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr>
References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr>
Message-ID:

On Mon, Aug 3, 2015 at 9:42 AM, Nicolas Pozin wrote:

> ------------------------------
>
> From: "Matthew Knepley"
> To: "Nicolas Pozin"
> Cc: "PETSc"
> Sent: Monday, 3 August 2015 16:33:16
> Subject: Re: [petsc-users] problem with MatShellGetContext
>
> On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin wrote:
>
>> Hello everyone,
>>
>> I am having trouble using MatShellGetContext.
>>
>> Here's the simple test I did:
>>
>> typedef struct{
>>   PetscInt testValue;
>>   Mat matShell;
>>   KSP currentCtx;
>> } AppCtx;
>>
>> AppCtx context1;
>> KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx);
>> context1.testValue=18;
>> MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx,
>> &context1.matShell);
>
> It looks like you want "&context1" for the context argument. You are just
> passing the KSP pointer.
>
>> AppCtx context2;
>> MatShellGetContext(context1.matShell, (void*)&context2);
>
> Here you better declare
>
>   AppCtx *context2;
>
> and access it as
>
>   context2->testValue;
>
> Thanks, but it doesn't work better unfortunately.
>

This tells me NOTHING. How can I help you with this information? This has
nothing to do with PETSc. It is simple C semantics. Write a test code and
send it in.

   Matt

> Matt
>
>> It happens that context2.testValue is different from 18.
>>
>> Would anyone have a clue about what I am missing?
>>
>> thanks a lot,
>> Nicolas
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 10:34:46 2015
From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se)
Date: Mon, 3 Aug 2015 15:34:46 +0000
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To:
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID:

Sherry and Hong,

If I use: -mat_superlu_dist_parsymbfact, I get:

Invalid ISPEC at line 484 in file get_perm_c.c

regardless of what I give to -mat_superlu_dist_matinput. I have not used -parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs.
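(For readers following the runtime options being debated here: the mpiexec command lines above and below drive the KSP/PC setup sketched next. This is a minimal illustrative sketch, not code taken from any message in this thread; the function names follow the PETSc 3.5/3.6-era C API used by the posters, and the routine name SolveWithSuperLUDist is invented for the example.)

#include <petscksp.h>

/* Sketch of -ksp_type preonly -pc_type lu
   -pc_factor_mat_solver_package superlu_dist expressed in code. */
PetscErrorCode SolveWithSuperLUDist(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iterations, just the direct solve */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);        /* -ksp_xxx and -pc_xxx overrides */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);           /* LU factorization is built in PCSetUp here */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

The -mat_superlu_dist_matinput and -mat_superlu_dist_parsymbfact options are only queried once the factorization is set up, which happens inside PCSetUp/KSPSolve; that is why the failures reported in this thread surface under KSPSetUp, PCSetUp and MatLUFactorNumeric in the stack traces.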
If I use 2 processors, the program runs if I use -mat_superlu_dist_parsymbfact=1:

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1

and

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact=1

I guess this corresponds to not setting parsymbfact at all. Both programs consume the same amount of RAM and seem to run well.

If I use (what seems to be correct):

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact

the result is:

Invalid ISPEC at line 484 in file get_perm_c.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: 3 August 2015 16:46
To: Xiaoye S. Li
Cc: Ülker-Kaustell, Mahir; Hong; PETSc users list
Subject: Re: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry found the culprit. I can reproduce it:

petsc/src/ksp/ksp/examples/tutorials
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact

Invalid ISPEC at line 484 in file get_perm_c.c
Invalid ISPEC at line 484 in file get_perm_c.c
-------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
...

The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default when using more than one process. Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or set matinput=GLOBAL for a parallel run? I'll add an error flag for these use cases.

Hong

On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote:

I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since the matrix is centralized). That's why you get the following error:

Invalid ISPEC at line 484 in file get_perm_c.c

You need to use the distributed matrix input interface pzgssvx() (without ABglobal).

Sherry

On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se wrote:

Hong and Sherry,

I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:

If I use -mat_superlu_dist_parsymbfact, the program crashes with:
Invalid ISPEC at line 484 in file get_perm_c.c

If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with:
Calloc fails for SPA dense[]. at line 438 in file zdistribute.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: 30 July 2015 02:58
To: Ülker-Kaustell, Mahir
Cc: Xiaoye Li; PETSc users list
Subject: Fwd: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry fixed several bugs in superlu_dist-v4.1. The current petsc release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?

Here is how to do it:
1. download superlu_dist v4.1
2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
3. build petsc

Let us know if the issue remains.

Hong

---------- Forwarded message ----------
From: Xiaoye S.
Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) 
Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to 
uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: 
MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) 
==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== 
Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) 
==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== 
by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From knepley at gmail.com  Mon Aug  3 10:39:47 2015
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 3 Aug 2015 10:39:47 -0500
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se>
 <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se>
 <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov>
 <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se>
 <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

On Mon, Aug 3, 2015 at 10:34 AM, Mahir.Ulker-Kaustell at tyrens.se <
Mahir.Ulker-Kaustell at tyrens.se> wrote:

> Sherry and Hong,
>
> If I use: -mat_superlu_dist_parsymbfact, I get:
> Invalid ISPEC at line 484 in file get_perm_c.c
> regardless of what I give to -mat_superlu_dist_matinput
>
> I have not used -parsymbfact in sequential runs or set matinput=GLOBAL for
> parallel runs.
>
> If I use 2 processors, the program runs if I use
> *-mat_superlu_dist_parsymbfact=1*:
>

Do not use "=1" for any PETSc option. This is improper syntax. It will
ignore that option. You use "-option 1" since all option arguments are
separated by a space, not an =.

   Matt

> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> GLOBAL -mat_superlu_dist_parsymbfact=1
>
> and
>
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> DISTRIBUTED -mat_superlu_dist_parsymbfact=1
>
> I guess this corresponds to not setting parsymbfact at all. Both programs
> consume the same amount of RAM and seem to run well.
>
> If I use (what seems to be correct):
>
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> DISTRIBUTED -mat_superlu_dist_parsymbfact
>
> the result is: Invalid ISPEC at line 484 in file get_perm_c.c
>
> Mahir
>
> *From:* Hong [mailto:hzhang at mcs.anl.gov]
> *Sent:* den 3 augusti 2015 16:46
> *To:* Xiaoye S. Li
> *Cc:* Ülker-Kaustell, Mahir; Hong; PETSc users list
> *Subject:* Re: [petsc-users] SuperLU MPI-problem
>
> Mahir,
>
> Sherry found the culprit. I can reproduce it:
> petsc/src/ksp/ksp/examples/tutorials
> mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist
> -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact
>
> Invalid ISPEC at line 484 in file get_perm_c.c
> Invalid ISPEC at line 484 in file get_perm_c.c
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> ...
>
> PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when
> using more than one processes.
> Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or
> set matinput=GLOBAL for parallel run?
>
> I'll add an error flag for these use cases.
>
> Hong
>
> On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote:
>
> I think I know the problem. Since zdistribute.c is called, I guess you
> are using the global (replicated) matrix input interface,
> pzgssvx_ABglobal(). This interface does not allow you to use parallel
> symbolic factorization (since matrix is centralized).
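A minimal sketch of Matt's point above about option syntax, assuming the PETSc 3.5/3.6-era C API used elsewhere in this thread (the two-argument form of PetscOptionsSetValue; later releases take a PetscOptions object as the first argument). The option names are the ones discussed in the messages; everything else is illustrative, not taken from the thread:

    /* Boolean options are passed as "-option" or "-option 1", never "-option=1",
     * e.g. (illustrative command line):
     *   mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu \
     *           -pc_factor_mat_solver_package superlu_dist \
     *           -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact
     */
    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      /* Programmatic equivalent of "-mat_superlu_dist_parsymbfact 1" on the command line. */
      ierr = PetscOptionsSetValue("-mat_superlu_dist_parsymbfact", "1");CHKERRQ(ierr);
      /* ... create the Mat/KSP and call KSPSetFromOptions() so the option is honored ... */
      ierr = PetscFinalize();
      return ierr;
    }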
> > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. 
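The flag-to-option mapping Hong describes above can be sketched against SuperLU_DIST's public options struct. This is only an illustration of that description, not the PETSc interface source; the helper name and the flag argument are made up:

    #include <superlu_ddefs.h>   /* superlu_options_t, set_default_options_dist() */

    /* Hypothetical helper mirroring the behavior described above. */
    static void setup_superlu_dist_options(superlu_options_t *options, int parsymbfact_requested)
    {
      set_default_options_dist(options);   /* defaults include ParSymbFact = NO */
      if (parsymbfact_requested) {
        options->ParSymbFact = YES;
        options->ColPerm     = PARMETIS;   /* forced whenever parallel symbolic factorization is on */
      }
    }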
> > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > 
==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: 
MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 
0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: 
MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: 
superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. 
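Hong's reuse suggestion quoted above (one symbolic factorization, re-done numerics per frequency) can be sketched as follows, assuming a complex-scalar PETSc 3.5/3.6-era build, M and K already assembled with the nonzero pattern of M contained in that of K, and F and u created to match. The function name, omega_list and nfreq are illustrative, and how much of the symbolic work is actually reused is left to the solver interface:

    #include <petscksp.h>

    /* Sketch: solve [K - omega^2 M] u = F for a list of frequencies, reusing one
     * KSP and one matrix A so only the numerical factorization is redone per omega. */
    static PetscErrorCode solve_frequency_sweep(Mat K, Mat M, Vec F, Vec u,
                                                const PetscScalar *omega_list, PetscInt nfreq)
    {
      Mat            A;
      KSP            ksp;
      PC             pc;
      PetscInt       i;
      PetscErrorCode ierr;

      ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);
      ierr = KSPCreate(PetscObjectComm((PetscObject)K), &ksp);CHKERRQ(ierr);
      ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr); /* 3.5/3.6-era name */
      ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

      for (i = 0; i < nfreq; ++i) {
        /* Rebuild A = K - omega^2 M in place; the nonzero pattern is unchanged. */
        ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = MatAXPY(A, -omega_list[i]*omega_list[i], M, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
        ierr = KSPSolve(ksp, F, u);CHKERRQ(ierr);
        /* ... post-process u for this frequency before the next solve overwrites it ... */
      }
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      return 0;
    }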
> > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 10:45:25 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Mon, 3 Aug 2015 15:45:25 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Matt, Thank you for clarifying this. Mahir ________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ________________________________ From: Matthew Knepley [mailto:knepley at gmail.com] Sent: den 3 augusti 2015 17:40 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem On Mon, Aug 3, 2015 at 10:34 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Sherry and Hong, If I use: -mat_superlu_dist_parsymbfact, I get: Invalid ISPEC at line 484 in file get_perm_c.c regardless of what I give to ?mat_superlu_dist_matinput I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: Do not use "=1" for any PETSc option. This is improper syntax. It will ignore that option. You use "-option 1" since all option arguments are separated by a space, not an =. Matt mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 and mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact=1 I guess this corresponds to not setting parsymbfact at all. Both programs consume the same amount of RAM and seem to run well. If I use (what seems to be correct): mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact the result is: Invalid ISPEC at line 484 in file get_perm_c.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... 
PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gtheler at cites-gss.com Mon Aug 3 10:50:36 2015 From: gtheler at cites-gss.com (Theler German Guillermo) Date: Mon, 3 Aug 2015 15:50:36 +0000 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: Hi Matt I get empty PetscEventPerfInfo structures after calling PetscLogEventGetPerfInfo(), i.e. both integers and floats are zero, as if the structure was just calloc'ed and never filled. However, I managed to get the overall stage CPU time (which is ok for me) by doing PetscLogGetStageLog(&stageLog); and then accessing stageLog->stageInfo[stage].perfInfo.time I attach a modified src/ksp/ksp/examples/tutorials/ex1.c that tries to illustrate my point. -- jeremy On Fri, 2015-07-31 at 09:00 -0500, Matthew Knepley wrote: > 2015-07-31 8:43 GMT-05:00 Theler German Guillermo > : > Is there a way to obtain as a PetscScalar the CPU time > associated to an > event or stage? > Something like PetscGetFlops() in an event or stage-based > basis? > > > Here is a test where I do that: > > > https://bitbucket.org/petsc/petsc/src/77c2d1544b79e11f3573a3360b35a7573ef4d1bf/src/dm/impls/plex/examples/tests/ex9.c?at=master#ex9.c-237 > > > ________________________________ Imprima este mensaje s?lo si es absolutamente necesario. 
Para imprimir, en lo posible utilice el papel de ambos lados. El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. ************AVISO DE CONFIDENCIALIDAD************ El Grupo Sancor Seguros comunica que: Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1.c Type: text/x-csrc Size: 8205 bytes Desc: ex1.c URL: From knepley at gmail.com Mon Aug 3 11:36:32 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Aug 2015 11:36:32 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: 2015-08-03 10:50 GMT-05:00 Theler German Guillermo : > Hi Matt > > I get empty PetscEventPerfInfo structures after calling > PetscLogEventGetPerfInfo(), i.e. both integers and floats are zero, as > If you do not pass -log_summary, you have to call PetscLogBegin() after PetscInitialize() to get it to start logging. Thanks, Matt > if the structure was just calloc'ed and never filled. However, I managed > to get the overall stage CPU time (which is ok for me) by doing > > PetscLogGetStageLog(&stageLog); > > and then accessing stageLog->stageInfo[stage].perfInfo.time > > I attach a modified src/ksp/ksp/examples/tutorials/ex1.c that tries to > illustrate my point. > > -- > jeremy > > > On Fri, 2015-07-31 at 09:00 -0500, Matthew Knepley wrote: > > 2015-07-31 8:43 GMT-05:00 Theler German Guillermo > > : > > Is there a way to obtain as a PetscScalar the CPU time > > associated to an > > event or stage? > > Something like PetscGetFlops() in an event or stage-based > > basis? > > > > > > Here is a test where I do that: > > > > > > > https://bitbucket.org/petsc/petsc/src/77c2d1544b79e11f3573a3360b35a7573ef4d1bf/src/dm/impls/plex/examples/tests/ex9.c?at=master#ex9.c-237 > > > > > > > > ________________________________ > Imprima este mensaje s?lo si es absolutamente necesario. > Para imprimir, en lo posible utilice el papel de ambos lados. > El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. > > > > ************AVISO DE CONFIDENCIALIDAD************ > > El Grupo Sancor Seguros comunica que: > > Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del > destinatario y pueden contener informaci?n confidencial o propietaria, cuya > divulgaci?n es sancionada por ley. Si usted recibi? 
este mensaje > erróneamente, por favor notifíquenos respondiendo al remitente, borre el > mensaje original y destruya las copias (impresas o grabadas en cualquier > medio magnético) que pueda haber realizado del mismo. Todas las opiniones > contenidas en este mail son propias del autor del mensaje. La publicación, > uso, copia o impresión total o parcial de este mensaje o documentos > adjuntos queda prohibida. > > Disposición DNDP 10-2008. El titular de los datos personales tiene la > facultad de ejercer el derecho de acceso a los mismos en forma gratuita a > intervalos no inferiores a seis meses, salvo que acredite un interés > legítimo al efecto conforme lo establecido en el artículo 14, inciso 3 de > la Ley 25.326. La DIRECCIÓN NACIONAL DE PROTECCIÓN DE DATOS PERSONALES, > Organo de Control de la Ley 25.326, tiene la atribución de atender las > denuncias y reclamos que se interpongan con relación al incumplimiento de > las normas sobre la protección de datos personales. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gtheler at cites-gss.com Mon Aug 3 11:52:41 2015 From: gtheler at cites-gss.com (Theler German Guillermo) Date: Mon, 3 Aug 2015 16:52:41 +0000 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: > I get empty PetscEventPerfInfo structures after calling > PetscLogEventGetPerfInfo(), i.e. both integers and floats are > zero, as > If you do not pass -log_summary, you have to call PetscLogBegin() > after PetscInitialize() to > get it to start logging. Got it! Thanks. Maybe that sentence should be added to the description of PetscLogEventGetPerfInfo() and friends. -- jeremy
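A minimal sketch of the logging pattern discussed in this thread: call PetscLogBegin() right after PetscInitialize() when -log_summary is not passed, then read the accumulated timing with PetscLogEventGetPerfInfo(). This is an illustrative sketch against the PETSc 3.5/3.6-era C API, not code from the thread; the class name "MyApp", the event name "MyWork", the PetscSleep() stand-in for real work, and the use of stage 0 (the default "Main Stage") are all made-up choices for the example.

static char help[] = "Sketch: query per-event time without -log_summary.\n";

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode     ierr;
  PetscClassId       classid;
  PetscLogEvent      my_event;   /* hypothetical user-defined event */
  PetscEventPerfInfo info;

  ierr = PetscInitialize(&argc, &argv, NULL, help); if (ierr) return ierr;
  ierr = PetscLogBegin();CHKERRQ(ierr);               /* start logging even without -log_summary */

  ierr = PetscClassIdRegister("MyApp", &classid);CHKERRQ(ierr);
  ierr = PetscLogEventRegister("MyWork", classid, &my_event);CHKERRQ(ierr);

  ierr = PetscLogEventBegin(my_event, 0, 0, 0, 0);CHKERRQ(ierr);
  ierr = PetscSleep(1);CHKERRQ(ierr);                 /* stand-in for the work being timed */
  ierr = PetscLogEventEnd(my_event, 0, 0, 0, 0);CHKERRQ(ierr);

  /* stage 0 is the default "Main Stage"; info.time holds accumulated wall-clock seconds on this rank */
  ierr = PetscLogEventGetPerfInfo(0, my_event, &info);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "MyWork: time %g s, flops %g\n", info.time, info.flops);CHKERRQ(ierr);

  ierr = PetscFinalize();
  return 0;
}

Built against a working PETSc installation and run without -log_summary, this should report a nonzero time for the event, which is the behavior the thread is after.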
From bsmith at mcs.anl.gov Mon Aug 3 12:00:45 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Aug 2015 12:00:45 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: > On Aug 3, 2015, at 11:52 AM, Theler German Guillermo wrote: > > >> I get empty PetscEventPerfInfo structures after calling >> PetscLogEventGetPerfInfo(), i.e. both integers and floats are >> zero, as >> If you do not pass -log_summary, you have to call PetscLogBegin() >> after PetscInitialize() to >> get it to start logging. > > Got it! Thanks. > Maybe that sentence should be added to the description of > PetscLogEventGetPerfInfo() and friends. We should probably trigger an error, with a very helpful error message, if these are called but the initialization was never done. Barry > > -- > jeremy > ________________________________ > Imprima este mensaje s?lo si es absolutamente necesario. > Para imprimir, en lo posible utilice el papel de ambos lados. > El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. > > > > ************AVISO DE CONFIDENCIALIDAD************ > > El Grupo Sancor Seguros comunica que: > > Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. > > Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. From hzhang at mcs.anl.gov Mon Aug 3 12:06:26 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 3 Aug 2015 12:06:26 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir, > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... 
SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. 
Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. 
Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
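For reference, a valgrind run like the one Barry describes is typically launched along the following lines. The example name ./ex19, the three MPI ranks, and the use of LU with SuperLU_DIST are consistent with the trace that follows; the exact solver options and valgrind flags here are illustrative:

    mpiexec -n 3 valgrind -q --tool=memcheck --track-origins=yes \
        ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist \
        -mat_superlu_dist_parsymbfact

The --track-origins=yes flag is what produces the "Uninitialised value was created by ..." lines in the report below.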
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 3 13:19:41 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Aug 2015 13:19:41 -0500 Subject: [petsc-users] failed to compile HDF5 on vesta In-Reply-To: References: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov> Message-ID: You can look at /soft/libraries/petsc/3.6.1.1/xl-opt/lib/petsc/conf/reconfigure-arch-xl-opt.py for currently used configure options for bgq install with xl compilers. Specifically: '--with-blas-lapack-lib=-L/soft/libraries/alcf/current/xl/LAPACK/lib -llapack -L/soft/libraries/alcf/current/xl/BLAS/lib -lblas', Satish On Sun, 2 Aug 2015, Fande Kong wrote: > Hi, Barry, > > Looks like they did not have fblaslapack installed. I could compile > the fblaslapack when I switched the compiler from XL to gcc. 
> > Thanks, > Fande Kong, > > On Sun, Aug 2, 2015 at 11:30 AM, Barry Smith wrote: > > > > > You shouldn't need --download-fblaslapack almost every system has it > > already installed. > > > > Barry > > > > Looks like the Fortran compiler on this system is rejecting the "old" > > Fortran in blas/lapack code. > > > > > > > On Aug 1, 2015, at 7:55 PM, Fande Kong wrote: > > > > > > HI barry, > > > > > > Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. > > Log file is attached. > > > > > > Fande Kong, > > > > > > On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote: > > > > > > Try with --with-shared-libraries=0 The HDF5 build is having some > > issue with shared libraries > > > > > > Barry > > > > > > > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote: > > > > > > > > Hi all, > > > > > > > > I want to install petsc on vesta (an IBM Blue Gene at Argonne). Failed > > to compile HDF5. The configure log file is attached. Any suggestions would > > be greatly appreciated. > > > > > > > > Thanks, > > > > > > > > Fande Kong, > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Mon Aug 3 22:00:56 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 12:00:56 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM Message-ID: Hello, I am a PhD student using PETsc for my research. I am wondering if there is a way to implement SPMM (Sparse matrix-matrix multiplication) by using PETSc. for example: I want to get matrix B in AX=B, where A is a sparse matrix and both X and B are dense matrices. Thanks in advance Regards Cong Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 4 01:27:54 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 00:27:54 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: Message-ID: <87egjjr2j9.fsf@jedbrown.org> Cong Li writes: > Hello, > > I am a PhD student using PETsc for my research. > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > multiplication) by using PETSc. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From solvercorleone at gmail.com Tue Aug 4 01:42:14 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 15:42:14 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <87egjjr2j9.fsf@jedbrown.org> References: <87egjjr2j9.fsf@jedbrown.org> Message-ID: Thanks for your reply. I have an other question. I want to do SPMM several times and combine result matrices into one bigger matrix. for example I firstly calculate AX1=B1, AX2=B2 ... then I want to combine B1, B2.. to get a C, where C=[B1,B2...] Could you please suggest a way of how to do this. Thanks Cong Li On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > Cong Li writes: > > > Hello, > > > > I am a PhD student using PETsc for my research. > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > multiplication) by using PETSc. > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > -------------- next part -------------- An HTML attachment was scrubbed... 
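A minimal sketch of the sparse-times-dense product behind the MatMatMult manual page Jed points to above (error checking omitted; the matrix names follow the thread, everything else is illustrative):

    Mat A, X, B;
    /* A: sparse (e.g. MPIAIJ), X: dense with a few columns (e.g. MPIDENSE) */
    /* ... create and assemble A and X ... */
    MatMatMult(A, X, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &B); /* allocates B = A*X */
    /* if the values of A or X change later but their layouts do not,
       the same B can be refilled: */
    MatMatMult(A, X, MAT_REUSE_MATRIX, PETSC_DEFAULT, &B);
    MatDestroy(&B);

How to lay several such products out as column blocks of one larger matrix C is taken up in the replies below.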
URL: From patrick.sanan at gmail.com Tue Aug 4 03:45:48 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 4 Aug 2015 10:45:48 +0200 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> Message-ID: <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > Thanks for your reply. > > I have an other question. > I want to do SPMM several times and combine result matrices into one bigger > matrix. > for example > I firstly calculate AX1=B1, AX2=B2 ... > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > Could you please suggest a way of how to do this. This is just linear algebra, nothing to do with PETSc specifically. A * [X1, X2, ... ] = [AX1, AX2, ...] > > Thanks > > Cong Li > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > Cong Li writes: > > > > > Hello, > > > > > > I am a PhD student using PETsc for my research. > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > multiplication) by using PETSc. > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: not available URL: From solvercorleone at gmail.com Tue Aug 4 04:09:30 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 18:09:30 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: I am sorry that I should have explained it more clearly. Actually I want to compute a recurrence. Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] Is there any way to do this efficiently. On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > Thanks for your reply. > > > > I have an other question. > > I want to do SPMM several times and combine result matrices into one > bigger > > matrix. > > for example > > I firstly calculate AX1=B1, AX2=B2 ... > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > Could you please suggest a way of how to do this. > This is just linear algebra, nothing to do with PETSc specifically. > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > Thanks > > > > Cong Li > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > Cong Li writes: > > > > > > > Hello, > > > > > > > > I am a PhD student using PETsc for my research. > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > multiplication) by using PETSc. > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrick.sanan at gmail.com Tue Aug 4 04:46:31 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 4 Aug 2015 11:46:31 +0200 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > ...] > > Is there any way to do this efficiently. With no other information about your problem, one literal solution might be to use MATNEST to define C once you have computed B1,B2,.. However, this invites questions about what you plan to do with C and whether you require explicit representations of some or all of these matrices, and what problem sizes you are considering. > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one > > bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse > > matrix-matrix > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 4 04:50:08 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 04:50:08 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 4:09 AM, Cong Li wrote: > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > ...] > > Is there any way to do this efficiently. > You could use a MatNest, however now this seems like thw wrong way to calculate it. Why do you want to put a matrix polynomial into another matrix? Matt > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > >> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > Thanks for your reply. >> > >> > I have an other question. >> > I want to do SPMM several times and combine result matrices into one >> bigger >> > matrix. >> > for example >> > I firstly calculate AX1=B1, AX2=B2 ... 
>> > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > >> > Could you please suggest a way of how to do this. >> This is just linear algebra, nothing to do with PETSc specifically. >> A * [X1, X2, ... ] = [AX1, AX2, ...] >> > >> > Thanks >> > >> > Cong Li >> > >> > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > >> > > Cong Li writes: >> > > >> > > > Hello, >> > > > >> > > > I am a PhD student using PETsc for my research. >> > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > multiplication) by using PETSc. >> > > >> > > >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 05:31:57 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 19:31:57 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: Actually, I am trying to implement s-step krylov subspace method. I want to extend the Krylov subspace by s dimensions by using monomial, which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), and then use the results to update the items in C. And then, in the next loop of Krylov subspace method, the C will be updated again. This means I need to update C in every iteration. This continues till the convergence criteria is satisfied. I suppose A is huge sparse SPD matrix with millions of rows, and X is tall-skinny dense matrix. Do you still think MATNEST is a good way to define C. Actually I am wondering if there is a way to do SPMM by using a submatrix of C and also store the result in a submatrix of C. If it is possible, I think we can remove some of cost of data movement. For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 to update c_2, and then use he result of A*c_2(updated) to update c_3 and so on. I don't need the intermediate result separately, such as the result of A*c_1, A*c_2. And I only need the final C. Is there any SPMM function or strategies I can use to achievement this? Thanks Cong Li On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > > A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > > ...] > > > > Is there any way to do this efficiently. > With no other information about your problem, one literal solution might > be to use MATNEST to define C once you have computed B1,B2,.. > However, this invites questions about what you plan to do with C and > whether you require explicit representations of some or all of these > matrices, and what problem sizes you are considering. 
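For concreteness, the MATNEST construction Patrick mentions would look roughly as follows; whether MATNEST is the right tool for this recurrence is debated elsewhere in the thread. A sketch only, with four blocks as an example and error checking omitted:

    Mat B[4];   /* B[0..3] already computed, all with the same row layout */
    Mat C;
    /* one block row, four block columns; NULL index sets let PETSc derive
       the layouts from the blocks themselves.  C references the blocks, it
       does not copy their entries. */
    MatCreateNest(PETSC_COMM_WORLD, 1, NULL, 4, NULL, B, &C);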
> > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > > wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. > > > > I want to do SPMM several times and combine result matrices into one > > > bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse > > > matrix-matrix > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 05:36:15 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 19:36:15 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: Hi As I answered in the last email. Actually, I am trying to implement s-step block krylov subspace method. So, I need to expand the Krylov subspace by putting matrix polynomials into another matrix. Cong Li On Tue, Aug 4, 2015 at 6:50 PM, Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 4:09 AM, Cong Li wrote: > >> I am sorry that I should have explained it more clearly. >> Actually I want to compute a recurrence. >> >> Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> A*B2=B3 and so on. >> Finally I want to combine all these results into a bigger matrix C=[B1,B2 >> ...] >> >> Is there any way to do this efficiently. >> > > You could use a MatNest, however now this seems like thw wrong way to > calculate it. Why > do you want to put a matrix polynomial into another matrix? > > Matt > > >> On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> wrote: >> >>> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > Thanks for your reply. >>> > >>> > I have an other question. >>> > I want to do SPMM several times and combine result matrices into one >>> bigger >>> > matrix. >>> > for example >>> > I firstly calculate AX1=B1, AX2=B2 ... >>> > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > >>> > Could you please suggest a way of how to do this. >>> This is just linear algebra, nothing to do with PETSc specifically. >>> A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >>> > >>> > > Cong Li writes: >>> > > >>> > > > Hello, >>> > > > >>> > > > I am a PhD student using PETsc for my research. >>> > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > multiplication) by using PETSc. 
>>> > > >>> > > >>> > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 06:34:50 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 06:34:50 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: > Actually, I am trying to implement s-step krylov subspace method. > I want to extend the Krylov subspace by s dimensions by using monomial, > which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my > plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), > and then use the results to update the items in C. And then, in the next > loop of Krylov subspace method, the C will be updated again. This means I > need to update C in every iteration. > This continues till the convergence criteria is satisfied. > > I suppose A is huge sparse SPD matrix with millions of rows, and X is > tall-skinny dense matrix. > > Do you still think MATNEST is a good way to define C. > > Actually I am wondering if there is a way to do SPMM by using a submatrix > of C and also store the result in a submatrix of C. If it is possible, I > think we can remove some of cost of data movement. > For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 > to update c_2, and then use he result of A*c_2(updated) to update c_3 and > so on. > I don't need the intermediate result separately, such as the result of > A*c_1, A*c_2. And I only need the final C. > Is there any SPMM function or strategies I can use to achievement this? > So there are two optimizations here: 1) Communication: You only communicate every s steps. If you are solving a transport dominated problem, this can make sense. For elliptic problems, I think it makes no difference at all. 2) Computation: You can alleviate bandwidth pressure by acting on multiple vectors at once. I would first implement this naively with a collection of Vecs to check that 1) makes a difference for your problem. If it does, then I think 2) can best be accomplished by using a TAIJ matrix and a long Vec, where you shift the memory at each iterate. Thanks, Matt > Thanks > > Cong Li > > > > On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan > wrote: > >> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >> > I am sorry that I should have explained it more clearly. >> > Actually I want to compute a recurrence. >> > >> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> > A*B2=B3 and so on. >> > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 >> > ...] >> > >> > Is there any way to do this efficiently. >> With no other information about your problem, one literal solution might >> be to use MATNEST to define C once you have computed B1,B2,.. >> However, this invites questions about what you plan to do with C and >> whether you require explicit representations of some or all of these >> matrices, and what problem sizes you are considering. 
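The "naive" variant with a collection of Vecs that Matt suggests above might look like this for a single starting vector; s, the names, and the absence of error checking are all for illustration:

    Mat      A;
    Vec      x, *c;
    PetscInt j, s = 4;
    /* ... create and assemble A, create and fill the starting vector x ... */
    VecDuplicateVecs(x, s + 1, &c);
    VecCopy(x, c[0]);
    for (j = 0; j < s; j++) {
      MatMult(A, c[j], c[j + 1]);   /* c[j+1] = A*c[j] */
    }
    /* ... use the basis c[0..s], e.g. orthogonalize it ... */
    VecDestroyVecs(s + 1, &c);

A block version would run the same loop over each column of X.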
>> > >> > >> > >> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> > wrote: >> > >> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > Thanks for your reply. >> > > > >> > > > I have an other question. >> > > > I want to do SPMM several times and combine result matrices into one >> > > bigger >> > > > matrix. >> > > > for example >> > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > >> > > > Could you please suggest a way of how to do this. >> > > This is just linear algebra, nothing to do with PETSc specifically. >> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > >> > > > Thanks >> > > > >> > > > Cong Li >> > > > >> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > > > >> > > > > Cong Li writes: >> > > > > >> > > > > > Hello, >> > > > > > >> > > > > > I am a PhD student using PETsc for my research. >> > > > > > I am wondering if there is a way to implement SPMM (Sparse >> > > matrix-matrix >> > > > > > multiplication) by using PETSc. >> > > > > >> > > > > >> > > > > >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > >> > > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 11:27:39 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 11:27:39 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. Barry > > Is there any way to do this efficiently. > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > Thanks for your reply. > > > > I have an other question. > > I want to do SPMM several times and combine result matrices into one bigger > > matrix. > > for example > > I firstly calculate AX1=B1, AX2=B2 ... > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > Could you please suggest a way of how to do this. > This is just linear algebra, nothing to do with PETSc specifically. > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > Thanks > > > > Cong Li > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > Cong Li writes: > > > > > > > Hello, > > > > > > > > I am a PhD student using PETsc for my research. > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > multiplication) by using PETSc. 
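A sketch of the column-sharing construction Barry describes above, with four blocks of k columns each as an example (M and k are illustrative and error checking is omitted):

    Mat          C, B[4];
    PetscScalar *array;
    PetscInt     i, m, M = 1000000, k = 10;
    /* C holds all 4*k columns */
    MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                   M, 4*k, NULL, &C);
    MatGetLocalSize(C, &m, NULL);   /* local rows of C, and of each Bi */
    MatDenseGetArray(C, &array);
    for (i = 0; i < 4; i++) {
      /* block i views columns i*k .. (i+1)*k-1 of C; PETSc dense storage is
         column-major with leading dimension m, hence the local shift i*m*k */
      MatCreateDense(PETSC_COMM_WORLD, m, PETSC_DECIDE, M, k,
                     array + i*m*k, &B[i]);
    }
    MatDenseRestoreArray(C, &array);
    /* the B[i] keep pointing into C's storage, so filling B[i] fills the
       corresponding columns of C; keep C alive while the B[i] are in use */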
> > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > From solvercorleone at gmail.com Tue Aug 4 11:59:31 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 01:59:31 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> Message-ID: Thanks very much. This answer is very helpful. And I have a following question. If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. PetscErrorCode MatMatMult (Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. Thanks Cong Li On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., > each Bi contains its columns of the C matrix. > > Barry > > > > > > > Is there any way to do this efficiently. > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one > bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 12:08:50 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 02:08:50 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: Thanks very much for your suggestions. Actually I am also considering using communication-avoiding matrix power kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK depends on the sparsity pattern. 
So the implementation could be very complex for some of problems.The efficient implementation of CA-MPK is actually one of problems I want to solve during my PhD course. As to the optimisation 2 you suggested, is it the same idea as what Barry Smith suggested? I am sorry that I am a Rookie to PETSc, so I am not quite familiar with PETSc implementation strategies. Thanks Cong Li On Tue, Aug 4, 2015 at 8:34 PM, Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: > >> Actually, I am trying to implement s-step krylov subspace method. >> I want to extend the Krylov subspace by s dimensions by using monomial, >> which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my >> plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), >> and then use the results to update the items in C. And then, in the next >> loop of Krylov subspace method, the C will be updated again. This means I >> need to update C in every iteration. >> This continues till the convergence criteria is satisfied. >> >> I suppose A is huge sparse SPD matrix with millions of rows, and X is >> tall-skinny dense matrix. >> >> Do you still think MATNEST is a good way to define C. >> >> Actually I am wondering if there is a way to do SPMM by using a submatrix >> of C and also store the result in a submatrix of C. If it is possible, I >> think we can remove some of cost of data movement. >> For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 >> to update c_2, and then use he result of A*c_2(updated) to update c_3 and >> so on. >> I don't need the intermediate result separately, such as the result of >> A*c_1, A*c_2. And I only need the final C. >> Is there any SPMM function or strategies I can use to achievement this? >> > > So there are two optimizations here: > > 1) Communication: You only communicate every s steps. If you are solving > a transport dominated problem, this can make sense. > For elliptic problems, I think it makes no difference at all. > > 2) Computation: You can alleviate bandwidth pressure by acting on > multiple vectors at once. > > I would first implement this naively with a collection of Vecs to check > that 1) makes a difference for your problem. > If it does, then I think 2) can best be accomplished by using a TAIJ > matrix and a long Vec, where you shift the > memory at each iterate. > > Thanks, > > Matt > > >> Thanks >> >> Cong Li >> >> >> >> On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan >> wrote: >> >>> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >>> > I am sorry that I should have explained it more clearly. >>> > Actually I want to compute a recurrence. >>> > >>> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>> > A*B2=B3 and so on. >>> > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 >>> > ...] >>> > >>> > Is there any way to do this efficiently. >>> With no other information about your problem, one literal solution might >>> be to use MATNEST to define C once you have computed B1,B2,.. >>> However, this invites questions about what you plan to do with C and >>> whether you require explicit representations of some or all of these >>> matrices, and what problem sizes you are considering. >>> > >>> > >>> > >>> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> > >>> > wrote: >>> > >>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > Thanks for your reply. >>> > > > >>> > > > I have an other question. 
>>> > > > I want to do SPMM several times and combine result matrices into >>> one >>> > > bigger >>> > > > matrix. >>> > > > for example >>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > >>> > > > Could you please suggest a way of how to do this. >>> > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > >>> > > > Thanks >>> > > > >>> > > > Cong Li >>> > > > >>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > >>> > > > > Cong Li writes: >>> > > > > >>> > > > > > Hello, >>> > > > > > >>> > > > > > I am a PhD student using PETsc for my research. >>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> > > matrix-matrix >>> > > > > > multiplication) by using PETSc. >>> > > > > >>> > > > > >>> > > > > >>> > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > >>> > > >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 12:11:40 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:11:40 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 12:08 PM, Cong Li wrote: > Thanks very much for your suggestions. > > Actually I am also considering using communication-avoiding matrix power > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > depends on the sparsity pattern. So the implementation could be very > complex for some of problems.The efficient implementation of CA-MPK is > actually one of problems I want to solve during my PhD course. > > As to the optimisation 2 you suggested, is it the same idea as what Barry > Smith suggested? > I am sorry that I am a Rookie to PETSc, so I am not quite familiar with > PETSc implementation strategies. > Yes, that is what Barry is suggesting. Thanks, Matt > Thanks > > Cong Li > > On Tue, Aug 4, 2015 at 8:34 PM, Matthew Knepley wrote: > >> On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: >> >>> Actually, I am trying to implement s-step krylov subspace method. >>> I want to extend the Krylov subspace by s dimensions by using monomial, >>> which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my >>> plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), >>> and then use the results to update the items in C. And then, in the next >>> loop of Krylov subspace method, the C will be updated again. This means I >>> need to update C in every iteration. >>> This continues till the convergence criteria is satisfied. >>> >>> I suppose A is huge sparse SPD matrix with millions of rows, and X is >>> tall-skinny dense matrix. >>> >>> Do you still think MATNEST is a good way to define C. >>> >>> Actually I am wondering if there is a way to do SPMM by using a >>> submatrix of C and also store the result in a submatrix of C. If it is >>> possible, I think we can remove some of cost of data movement. 
>>> For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 >>> to update c_2, and then use he result of A*c_2(updated) to update c_3 and >>> so on. >>> I don't need the intermediate result separately, such as the result of >>> A*c_1, A*c_2. And I only need the final C. >>> Is there any SPMM function or strategies I can use to achievement this? >>> >> >> So there are two optimizations here: >> >> 1) Communication: You only communicate every s steps. If you are >> solving a transport dominated problem, this can make sense. >> For elliptic problems, I think it makes no difference at all. >> >> 2) Computation: You can alleviate bandwidth pressure by acting on >> multiple vectors at once. >> >> I would first implement this naively with a collection of Vecs to check >> that 1) makes a difference for your problem. >> If it does, then I think 2) can best be accomplished by using a TAIJ >> matrix and a long Vec, where you shift the >> memory at each iterate. >> >> Thanks, >> >> Matt >> >> >>> Thanks >>> >>> Cong Li >>> >>> >>> >>> On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan >>> wrote: >>> >>>> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >>>> > I am sorry that I should have explained it more clearly. >>>> > Actually I want to compute a recurrence. >>>> > >>>> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>>> > A*B2=B3 and so on. >>>> > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 >>>> > ...] >>>> > >>>> > Is there any way to do this efficiently. >>>> With no other information about your problem, one literal solution >>>> might be to use MATNEST to define C once you have computed B1,B2,.. >>>> However, this invites questions about what you plan to do with C and >>>> whether you require explicit representations of some or all of these >>>> matrices, and what problem sizes you are considering. >>>> > >>>> > >>>> > >>>> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> >>>> > wrote: >>>> > >>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > Thanks for your reply. >>>> > > > >>>> > > > I have an other question. >>>> > > > I want to do SPMM several times and combine result matrices into >>>> one >>>> > > bigger >>>> > > > matrix. >>>> > > > for example >>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > >>>> > > > Could you please suggest a way of how to do this. >>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > >>>> > > > Thanks >>>> > > > >>>> > > > Cong Li >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > >>>> > > > > Cong Li writes: >>>> > > > > >>>> > > > > > Hello, >>>> > > > > > >>>> > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> > > matrix-matrix >>>> > > > > > multiplication) by using PETSc. >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > >>>> > > >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.vymazal at vki.ac.be Tue Aug 4 12:15:24 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 18:15:24 +0100 Subject: [petsc-users] C++ wrapper for petsc vector Message-ID: <1585215.z8oGCl3ZR4@tinlaptop> Hello, I'm trying to create a small C++ class to wrap the 'Vec' object. This class has an internal pointer to a member variable of type Vec, and in its destructor, it calls VecDestroy. Unfortunately, my test program segfaults and this seems to be due to the fact that the destructor of the wrapper class is called after main() calls PetscFinalize(). Apparently VecDestroy performs some collective communication, so calling it after PetscFinalize() is too late. How can I fix this? Thank you, Martin Vymazal From jed at jedbrown.org Tue Aug 4 12:20:19 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:20:19 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: <87pp33otrg.fsf@jedbrown.org> Cong Li writes: > Thanks very much for your suggestions. > > Actually I am also considering using communication-avoiding matrix power > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > depends on the sparsity pattern. So the implementation could be very > complex for some of problems.The efficient implementation of CA-MPK is > actually one of problems I want to solve during my PhD course. You can get the pattern with MatIncreaseOverlap. Lack of useful preconditioning all but destroys the practical utility of these matrix powers kernels for solvers, even if the surface area/volume ratio is favorable (an odd corner of the relevant problem space). I consider it a fashion that has already gotten more attention than it deserves and should be on its way out. I would recommend a different thesis topic if you want to do work with a tangible impact on practical computational science and engineering. If you're really excited about this niche, by all means, have fun. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gpau at lbl.gov Tue Aug 4 12:22:18 2015 From: gpau at lbl.gov (George Pau) Date: Tue, 4 Aug 2015 10:22:18 -0700 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental Message-ID: Hi, I am configuring petsc on NERSC/Edison with the following configure arguments: --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp e/2.3.1/bin/ftn but it seems like the --with-shared-libraries=0 is not propagated when building elemental. 
In the end I get the following error: gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. rodata' can not be used when making a shared object; recompile with -fPIC Any help will be appreciated. Attached is the configure log file. Thanks, George -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-configure-out.log Type: application/octet-stream Size: 210962 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 4 12:22:54 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:22:54 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <87pp33otrg.fsf@jedbrown.org> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> <87pp33otrg.fsf@jedbrown.org> Message-ID: On Tue, Aug 4, 2015 at 12:20 PM, Jed Brown wrote: > Cong Li writes: > > > Thanks very much for your suggestions. > > > > Actually I am also considering using communication-avoiding matrix power > > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > > depends on the sparsity pattern. So the implementation could be very > > complex for some of problems.The efficient implementation of CA-MPK is > > actually one of problems I want to solve during my PhD course. > > You can get the pattern with MatIncreaseOverlap. > > Lack of useful preconditioning all but destroys the practical utility of > these matrix powers kernels for solvers, even if the surface area/volume > ratio is favorable (an odd corner of the relevant problem space). I > consider it a fashion that has already gotten more attention than it > deserves and should be on its way out. I would recommend a different > thesis topic if you want to do work with a tangible impact on practical > computational science and engineering. If you're really excited about > this niche, by all means, have fun. > Totally true if you are doing this for solvers, particularly elliptic solvers. If you are planning to use it for something like Tall-Skinny QR, or maybe some graph problems it could make more sense. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 12:24:14 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:24:14 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <1585215.z8oGCl3ZR4@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal wrote: > Hello, > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > class > has an internal pointer to a member variable of type Vec, and in its > destructor, it calls VecDestroy. 
Unfortunately, my test program segfaults > and > this seems to be due to the fact that the destructor of the wrapper class > is > called after main() calls PetscFinalize(). Apparently VecDestroy performs > some > collective communication, so calling it after PetscFinalize() is too late. > How > can I fix this? > 1) Declare your C++ in a scope, so that it goes out of scope before PetscFinalize() 2) Is there any utility to this wrapper since everything can be called directly from C++? Thanks, Matt > Thank you, > > Martin Vymazal > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 4 12:24:38 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:24:38 -0600 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <1585215.z8oGCl3ZR4@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: <87mvy7otk9.fsf@jedbrown.org> Martin Vymazal writes: > Hello, > > I'm trying to create a small C++ class to wrap the 'Vec' object. A word of warning: Lots of people try this, but I've never seen an implementation that wasn't a leaky, high-maintenance abstraction with purely cosmetic value. > This class has an internal pointer to a member variable of type Vec, > and in its destructor, it calls VecDestroy. Unfortunately, my test > program segfaults and this seems to be due to the fact that the > destructor of the wrapper class is called after main() calls > PetscFinalize(). Apparently VecDestroy performs some collective > communication, so calling it after PetscFinalize() is too late. How > can I fix this? Scope so your objects are destroyed before PetscFinalize. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Tue Aug 4 12:26:39 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:26:39 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> <87pp33otrg.fsf@jedbrown.org> Message-ID: <87io8votgw.fsf@jedbrown.org> Matthew Knepley writes: > Totally true if you are doing this for solvers, particularly elliptic > solvers. If you are planning to use it for something like Tall-Skinny > QR, or maybe some graph problems it could make more sense. Note that TSQR does not involve matrix powers. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From martin.vymazal at vki.ac.be Tue Aug 4 12:30:58 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 18:30:58 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: <3775667.cNPH29TcoF@tinlaptop> Hello, 1) thank you for the suggestion. 2) suppose you want to be able to switch between solver implementations provided by different libraries (e.g. petsc/trilinos). One obvious approach is through inheritance, but in order to keep child interfaces conforming to base class signatures, I need to wrap the solvers. 
If you can think of a better approach that would keep switching between solvers easy, I'm open to suggestions. I don't really need both trilinos and petsc, this is just a matter of curiosity. Best regards, Martin Vymazal On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > wrote: > > Hello, > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > class > > has an internal pointer to a member variable of type Vec, and in its > > destructor, it calls VecDestroy. Unfortunately, my test program segfaults > > and > > this seems to be due to the fact that the destructor of the wrapper class > > is > > called after main() calls PetscFinalize(). Apparently VecDestroy performs > > some > > collective communication, so calling it after PetscFinalize() is too late. > > How > > can I fix this? > > 1) Declare your C++ in a scope, so that it goes out of scope before > PetscFinalize() > > 2) Is there any utility to this wrapper since everything can be called > directly from C++? > > Thanks, > > Matt > > > Thank you, > > > > Martin Vymazal From knepley at gmail.com Tue Aug 4 12:34:01 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:34:01 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <3775667.cNPH29TcoF@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal wrote: > Hello, > > 1) thank you for the suggestion. > 2) suppose you want to be able to switch between solver implementations > provided by different libraries (e.g. petsc/trilinos). One obvious > approach is > through inheritance, but in order to keep child interfaces conforming to > base > class signatures, I need to wrap the solvers. If you can think of a better > approach that would keep switching between solvers easy, I'm open to > suggestions. I don't really need both trilinos and petsc, this is just a > matter of curiosity. > I think this is a bad way of doing that. You would introduce a whole bunch of types at the top level which are meaningless (just like Trilinos). If you want another solver, just wrap it up in the PETSc PCShell object (two calls at most). Its an easier to write wrapper, which also fits in with all the debugging and profiling. We wrap a bunch of things this way like Hypre (70+ packages last time I checked). Matt > Best regards, > > Martin Vymazal > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > martin.vymazal at vki.ac.be> > > > > wrote: > > > Hello, > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > class > > > has an internal pointer to a member variable of type Vec, and in its > > > destructor, it calls VecDestroy. Unfortunately, my test program > segfaults > > > and > > > this seems to be due to the fact that the destructor of the wrapper > class > > > is > > > called after main() calls PetscFinalize(). Apparently VecDestroy > performs > > > some > > > collective communication, so calling it after PetscFinalize() is too > late. > > > How > > > can I fix this? > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > PetscFinalize() > > > > 2) Is there any utility to this wrapper since everything can be called > > directly from C++? 
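To make suggestion 1) concrete, here is a minimal sketch of a scoped wrapper; the class name is illustrative and error checking is omitted for brevity. The only point is that the destructor, and hence VecDestroy(), runs at the closing brace of the inner block, before PetscFinalize() is reached.

  #include <petscvec.h>

  class VecWrapper {
  public:
    VecWrapper(MPI_Comm comm, PetscInt n) {
      VecCreate(comm, &v_);
      VecSetSizes(v_, PETSC_DECIDE, n);
      VecSetFromOptions(v_);
    }
    ~VecWrapper() { VecDestroy(&v_); }   /* collective, so it must run before PetscFinalize() */
    Vec get() const { return v_; }
  private:
    Vec v_;                              /* copy/assignment handling omitted from the sketch */
  };

  int main(int argc, char **argv)
  {
    PetscInitialize(&argc, &argv, NULL, NULL);
    {                                    /* inner scope */
      VecWrapper w(PETSC_COMM_WORLD, 100);
      VecSet(w.get(), 1.0);
    }                                    /* ~VecWrapper() -> VecDestroy() happens here */
    PetscFinalize();                     /* safe: no live PETSc objects remain */
    return 0;
  }

A real wrapper would also have to decide what copy and assignment mean, which is where such thin abstractions tend to start leaking.
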
> > > > Thanks, > > > > Matt > > > > > Thank you, > > > > > > Martin Vymazal > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.vymazal at vki.ac.be Tue Aug 4 13:07:56 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 19:07:56 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: <2182692.n8DuiFgnrM@tinlaptop> On Tuesday, August 04, 2015 12:34:01 PM Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal > > wrote: > > Hello, > > > > 1) thank you for the suggestion. > > 2) suppose you want to be able to switch between solver implementations > > > > provided by different libraries (e.g. petsc/trilinos). One obvious > > approach is > > through inheritance, but in order to keep child interfaces conforming to > > base > > class signatures, I need to wrap the solvers. If you can think of a better > > approach that would keep switching between solvers easy, I'm open to > > suggestions. I don't really need both trilinos and petsc, this is just a > > matter of curiosity. > > I think this is a bad way of doing that. You would introduce a whole bunch > of types > at the top level which are meaningless (just like Trilinos). If you want > another solver, > just wrap it up in the PETSc PCShell object (two calls at most). Its an > easier to write > wrapper, which also fits in with all the debugging and profiling. We wrap a > bunch of > things this way like Hypre (70+ packages last time I checked). > > Matt OK, I was not aware of PCShell (I'm new to PETSc). I don't know Trilinos well enough to judge whether it's good from software engineering point of view or not, but allow me one last question. What would happen if I wrap all 'other' solvers in PCShell and then for some reason, PETSc is not available. None of the other solvers would be accessible (unless I modify the source code), so wrapping everything using PCShell creates a strong dependency on one particular library (PETSc), doesn't it? Martin > > > Best regards, > > > > Martin Vymazal > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > > > > martin.vymazal at vki.ac.be> > > > > > wrote: > > > > Hello, > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > > > class > > > > has an internal pointer to a member variable of type Vec, and in its > > > > destructor, it calls VecDestroy. Unfortunately, my test program > > > > segfaults > > > > > > and > > > > this seems to be due to the fact that the destructor of the wrapper > > > > class > > > > > > is > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > performs > > > > > > some > > > > collective communication, so calling it after PetscFinalize() is too > > > > late. > > > > > > How > > > > can I fix this? > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > PetscFinalize() > > > > > > 2) Is there any utility to this wrapper since everything can be called > > > directly from C++? 
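For reference, the PCShell route mentioned above looks roughly like the sketch below. MyExternalSolverApply is a placeholder for whatever apply routine the third-party library actually provides, not a real API, and the VecCopy stand-in is only there so the sketch is self-contained.

  #include <petscksp.h>

  static PetscErrorCode MyShellApply(PC pc, Vec x, Vec y)
  {
    void           *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = PCShellGetContext(pc, &ctx);CHKERRQ(ierr);
    /* hand x to the external solver via ctx and put the result in y, e.g.
       MyExternalSolverApply(ctx, x, y);   <- placeholder, not a real call */
    ierr = VecCopy(x, y);CHKERRQ(ierr);    /* stand-in so the sketch compiles */
    PetscFunctionReturn(0);
  }

  static PetscErrorCode AttachExternalPC(KSP ksp, void *external_ctx)
  {
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCSHELL);CHKERRQ(ierr);
    ierr = PCShellSetContext(pc, external_ctx);CHKERRQ(ierr);
    ierr = PCShellSetApply(pc, MyShellApply);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

The context pointer is an opaque handle for the external library's state; PETSc never inspects it.
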
> > > > > > Thanks, > > > > > > Matt > > > > > > > > Thank you, > > > > > > > > Martin Vymazal From bsmith at mcs.anl.gov Tue Aug 4 13:09:04 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 13:09:04 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> Message-ID: <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. Barry > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > Thanks very much. This answer is very helpful. > And I have a following question. > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > Barry > > > > > > > Is there any way to do this efficiently. > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > multiplication) by using PETSc. 
> > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > From bsmith at mcs.anl.gov Tue Aug 4 13:19:33 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 13:19:33 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> > On Aug 4, 2015, at 12:34 PM, Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal wrote: > Hello, > > 1) thank you for the suggestion. > 2) suppose you want to be able to switch between solver implementations > provided by different libraries (e.g. petsc/trilinos). One obvious approach is > through inheritance, but in order to keep child interfaces conforming to base > class signatures, I need to wrap the solvers. If you can think of a better > approach that would keep switching between solvers easy, I'm open to > suggestions. I don't really need both trilinos and petsc, this is just a > matter of curiosity. > > I think this is a bad way of doing that. You would introduce a whole bunch of types > at the top level which are meaningless (just like Trilinos). If you want another solver, > just wrap it up in the PETSc PCShell object (two calls at most). Its an easier to write > wrapper, which also fits in with all the debugging and profiling. We wrap a bunch of > things this way like Hypre (70+ packages last time I checked). As Matt points out PETSc is already a wrapper library in that it is designed to easily wrap around other solver libraries to use the common PETSc API. So what you are doing is writing a wrapper library around a wrapper library, certainly possible but of questionable value. Note also that what makes particular solvers powerful is their use of "extra information" to obtain fast convergence over the only information being the "matrix values"; so for example the near null space for some algebraic multigrid methods, the geometric information for geometric multigrid, the "block structure" for "block preconditioners (what we call PCFIELDSPLIT in PETSc), etc. Do you really want to handle all of this "extra information" in your wrapper class? We already understand these details and provide APIs for them, it would be a huge thankless project for you to reproduce them all in your API. Barry > > Matt > > Best regards, > > Martin Vymazal > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > > > wrote: > > > Hello, > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > class > > > has an internal pointer to a member variable of type Vec, and in its > > > destructor, it calls VecDestroy. Unfortunately, my test program segfaults > > > and > > > this seems to be due to the fact that the destructor of the wrapper class > > > is > > > called after main() calls PetscFinalize(). Apparently VecDestroy performs > > > some > > > collective communication, so calling it after PetscFinalize() is too late. > > > How > > > can I fix this? > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > PetscFinalize() > > > > 2) Is there any utility to this wrapper since everything can be called > > directly from C++? 
> > > > Thanks, > > > > Matt > > > > > Thank you, > > > > > > Martin Vymazal > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From martin.vymazal at vki.ac.be Tue Aug 4 13:48:02 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 19:48:02 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> References: <1585215.z8oGCl3ZR4@tinlaptop> <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> Message-ID: <1568148.yQyVaWRmv2@tinlaptop> On Tuesday, August 04, 2015 01:19:33 PM Barry Smith wrote: > > On Aug 4, 2015, at 12:34 PM, Matthew Knepley wrote: > > > > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal > > wrote: Hello, > > > > 1) thank you for the suggestion. > > 2) suppose you want to be able to switch between solver implementations > > > > provided by different libraries (e.g. petsc/trilinos). One obvious > > approach is through inheritance, but in order to keep child interfaces > > conforming to base class signatures, I need to wrap the solvers. If you > > can think of a better approach that would keep switching between solvers > > easy, I'm open to suggestions. I don't really need both trilinos and > > petsc, this is just a matter of curiosity. > > > > I think this is a bad way of doing that. You would introduce a whole bunch > > of types at the top level which are meaningless (just like Trilinos). If > > you want another solver, just wrap it up in the PETSc PCShell object (two > > calls at most). Its an easier to write wrapper, which also fits in with > > all the debugging and profiling. We wrap a bunch of things this way like > > Hypre (70+ packages last time I checked). > > As Matt points out PETSc is already a wrapper library in that it is > designed to easily wrap around other solver libraries to use the common > PETSc API. So what you are doing is writing a wrapper library around a > wrapper library, certainly possible but of questionable value. > > Note also that what makes particular solvers powerful is their use of > "extra information" to obtain fast convergence over the only information > being the "matrix values"; so for example the near null space for some > algebraic multigrid methods, the geometric information for geometric > multigrid, the "block structure" for "block preconditioners (what we call > PCFIELDSPLIT in PETSc), etc. Do you really want to handle all of this > "extra information" in your wrapper class? We already understand these > details and provide APIs for them, it would be a huge thankless project for > you to reproduce them all in your API. > > Barry Of course I prefer to rely on other people's expertise in the domain instead of doing the job over again (with probably worse result). What you say about PETSc being a wrapper for other libraries makes sense. Martin > > > Matt > > > > Best regards, > > > > Martin Vymazal > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > > > > > > > > wrote: > > > > Hello, > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > > > class > > > > has an internal pointer to a member variable of type Vec, and in its > > > > destructor, it calls VecDestroy. 
Unfortunately, my test program > > > > segfaults > > > > and > > > > this seems to be due to the fact that the destructor of the wrapper > > > > class > > > > is > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > performs > > > > some > > > > collective communication, so calling it after PetscFinalize() is too > > > > late. > > > > How > > > > can I fix this? > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > PetscFinalize() > > > > > > 2) Is there any utility to this wrapper since everything can be called > > > directly from C++? > > > > > > Thanks, > > > > > > Matt > > > > > > > > Thank you, > > > > > > > > Martin Vymazal From bsmith at mcs.anl.gov Tue Aug 4 14:33:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 14:33:31 -0500 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: Aghh, looks like CMake does not have a universal standard for indicating shared libraries or not. Please try the attached elemental.py file and see if that resolves your difficulties. -------------- next part -------------- A non-text attachment was scrubbed... Name: elemental.py Type: text/x-python-script Size: 2705 bytes Desc: not available URL: -------------- next part -------------- Barry > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > Hi, > > I am configuring petsc on NERSC/Edison with the following configure arguments: > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > e/2.3.1/bin/ftn > > but it seems like the --with-shared-libraries=0 is not propagated when building elemental. In the end I get the following error: > > gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. > rodata' can not be used when making a shared object; recompile with -fPIC > > Any help will be appreciated. Attached is the configure log file. > > Thanks, > George > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ > From gpau at lbl.gov Tue Aug 4 17:06:55 2015 From: gpau at lbl.gov (George Pau) Date: Tue, 4 Aug 2015 15:06:55 -0700 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: Barry, Thanks. The patch works. George On Tue, Aug 4, 2015 at 12:33 PM, Barry Smith wrote: > > Aghh, looks like CMake does not have a universal standard for indicating > shared libraries or not. > > Please try the attached elemental.py file and see if that resolves your > difficulties. 
> > > > Barry > > > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > > > Hi, > > > > I am configuring petsc on NERSC/Edison with the following configure > arguments: > > > > --with-debugging=1 --with-shared-libraries=0 > --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps > --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf > --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC > --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > but it seems like the --with-shared-libraries=0 is not propagated when > building elemental. In the end I get the following error: > > > > gmake[3]: Leaving directory > `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > > > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: > /usr/common/us > > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation > R_X86_64_32 against `. > > rodata' can not be used when making a shared object; recompile with -fPIC > > > > Any help will be appreciated. Attached is the configure log file. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 17:08:36 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:08:36 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: <53676E55-FEA5-4B0A-B85B-DBC0DF8421C7@mcs.anl.gov> I have added error checking in the branches maint, master, and next so no else will waste their time trying to figure out why the routine is returning nothing useful. Thanks for reporting the issue, Barry > On Aug 3, 2015, at 12:00 PM, Barry Smith wrote: > >> >> On Aug 3, 2015, at 11:52 AM, Theler German Guillermo wrote: >> >> >>> I get empty PetscEventPerfInfo structures after calling >>> PetscLogEventGetPerfInfo(), i.e. both integers and floats are >>> zero, as >>> If you do not pass -log_summary, you have to call PetscLogBegin() >>> after PetscInitialize() to >>> get it to start logging. >> >> Got it! Thanks. >> Maybe that sentence should be added to the description of >> PetscLogEventGetPerfInfo() and friends. > > We should probably trigger an error, with a very helpful error message, if these are called but the initialization was never done. > > Barry > >> >> -- >> jeremy >> ________________________________ >> Imprima este mensaje s?lo si es absolutamente necesario. >> Para imprimir, en lo posible utilice el papel de ambos lados. >> El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. >> >> >> >> ************AVISO DE CONFIDENCIALIDAD************ >> >> El Grupo Sancor Seguros comunica que: >> >> Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? 
este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. >> >> Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. From jychang48 at gmail.com Tue Aug 4 17:09:11 2015 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 4 Aug 2015 17:09:11 -0500 Subject: [petsc-users] Profiling/checkpoints Message-ID: Hi all, Not sure what to title this mail, but let me begin with an analogy of what I am looking for: In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). Or Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 17:09:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:09:59 -0500 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: <2387A54D-8E42-497C-94AF-21D484B87FF8@mcs.anl.gov> George, Thanks for letting us know. Now fixed in maint, master, next and will be in our next patch release Barry commit 5219fea8a3666c8b7c80c852d85e563864a43d28 Author: Barry Smith Date: Tue Aug 4 14:33:41 2015 -0500 Turn off elemental shared libraries if --with-shared-libraries=0 is used Reported-by: George Pau > On Aug 4, 2015, at 5:06 PM, George Pau wrote: > > Barry, > > Thanks. The patch works. > > George > > > On Tue, Aug 4, 2015 at 12:33 PM, Barry Smith wrote: > > Aghh, looks like CMake does not have a universal standard for indicating shared libraries or not. > > Please try the attached elemental.py file and see if that resolves your difficulties. 
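Tying back to the "Get CPU time from events" exchange just above, a minimal sketch of the programmatic route is: call PetscLogBegin() right after PetscInitialize(), then query the event afterwards. The event and class names and the use of stage 0 (the default stage) are illustrative, error checking is omitted, and the exact logging signatures can differ between PETSc versions, so treat this as a sketch rather than a reference.

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscClassId       classid;
    PetscLogEvent      MY_EVENT;
    PetscEventPerfInfo info;

    PetscInitialize(&argc, &argv, NULL, NULL);
    PetscLogBegin();                              /* start logging even without -log_summary */
    PetscClassIdRegister("MyClass", &classid);
    PetscLogEventRegister("MyEvent", classid, &MY_EVENT);

    PetscLogEventBegin(MY_EVENT, 0, 0, 0, 0);
    /* ... the work to be timed goes here ... */
    PetscLogEventEnd(MY_EVENT, 0, 0, 0, 0);

    PetscLogEventGetPerfInfo(0, MY_EVENT, &info); /* stage 0 is the default stage */
    PetscPrintf(PETSC_COMM_WORLD, "count %d time %g\n", info.count, (double)info.time);

    PetscFinalize();
    return 0;
  }
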
> > > > Barry > > > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > > > Hi, > > > > I am configuring petsc on NERSC/Edison with the following configure arguments: > > > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > but it seems like the --with-shared-libraries=0 is not propagated when building elemental. In the end I get the following error: > > > > gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us > > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. > > rodata' can not be used when making a shared object; recompile with -fPIC > > > > Any help will be appreciated. Attached is the configure log file. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > > > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ From jed at jedbrown.org Tue Aug 4 17:14:01 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 16:14:01 -0600 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: Message-ID: <87wpxazopi.fsf@jedbrown.org> Justin Chang writes: > Hi all, > > Not sure what to title this mail, but let me begin with an analogy of what > I am looking for: > > In MATLAB, we could insert breakpoints into the code, such that when we run > the program, we could pause the execution and see what the variables > contain and what is going on exactly within your function calls. Is there a > way to do something like this within PETSc? Yes, they're called breakpoints and available with any debugger. http://www.sourceware.org/gdb/onlinedocs/gdb/Set-Breaks.html Compile with debugging symbols. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Aug 4 17:22:48 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:22:48 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: Message-ID: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. 
In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. Let us know how it goes and we can try to improve the experience with your suggestions, Barry > On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: > > Hi all, > > Not sure what to title this mail, but let me begin with an analogy of what I am looking for: > > In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? > > I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). > > Or > > Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? > > Thanks, > Justin From bsmith at mcs.anl.gov Tue Aug 4 17:36:08 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:36:08 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> Message-ID: Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. Barry PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. > On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: > > > I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. > > Let us know how it goes and we can try to improve the experience with your suggestions, > > Barry > >> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >> >> Hi all, >> >> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >> >> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. 
Is there a way to do something like this within PETSc? >> >> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >> >> Or >> >> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >> >> Thanks, >> Justin > From patrick.sanan at gmail.com Tue Aug 4 18:20:22 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 5 Aug 2015 01:20:22 +0200 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> Message-ID: <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> And note that it is possible to run gdb/lldb on each of several MPI processes, useful when you hit a bug that only appears in parallel. For example, this FAQ describes a couple of ways to do this: https://www.open-mpi.org/faq/?category=debugging#serial-debuggers > Am 05.08.2015 um 00:36 schrieb Barry Smith : > > > Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. > > To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. > > > Barry > > PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. > >> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: >> >> >> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. >> >> Let us know how it goes and we can try to improve the experience with your suggestions, >> >> Barry >> >>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >>> >>> Hi all, >>> >>> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >>> >>> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? >>> >>> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. 
Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >>> >>> Or >>> >>> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >>> >>> Thanks, >>> Justin > From bsmith at mcs.anl.gov Tue Aug 4 18:33:34 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 18:33:34 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> Message-ID: <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> > On Aug 4, 2015, at 6:20 PM, Patrick Sanan wrote: > > And note that it is possible to run gdb/lldb on each of several MPI processes, useful when you hit a bug that only appears in parallel. For example, this FAQ describes a couple of ways to do this: > > https://www.open-mpi.org/faq/?category=debugging#serial-debuggers You can also use the PETSc option -start_in_debugger which can work under some circumstances (like all MPI processes have access to the X server). Barry > > >> Am 05.08.2015 um 00:36 schrieb Barry Smith : >> >> >> Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. >> >> To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. >> >> >> Barry >> >> PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. >> >>> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: >>> >>> >>> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. >>> >>> Let us know how it goes and we can try to improve the experience with your suggestions, >>> >>> Barry >>> >>>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >>>> >>>> Hi all, >>>> >>>> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >>>> >>>> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? >>>> >>>> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. 
Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >>>> >>>> Or >>>> >>>> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >>>> >>>> Thanks, >>>> Justin >> From knepley at gmail.com Tue Aug 4 18:43:55 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 18:43:55 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> Message-ID: On Tue, Aug 4, 2015 at 6:33 PM, Barry Smith wrote: > > > On Aug 4, 2015, at 6:20 PM, Patrick Sanan > wrote: > > > > And note that it is possible to run gdb/lldb on each of several MPI > processes, useful when you hit a bug that only appears in parallel. For > example, this FAQ describes a couple of ways to do this: > > > > https://www.open-mpi.org/faq/?category=debugging#serial-debuggers > > You can also use the PETSc option -start_in_debugger which can work > under some circumstances (like all MPI processes have access to the X > server). and you can start debuggers on only some processes using -debugger_nodes 1,3,7 Thanks, Matt > > Barry > > > > > > >> Am 05.08.2015 um 00:36 schrieb Barry Smith : > >> > >> > >> Correction, even in parallel you should be able to use a 0 for the > viewer for calls to KSPView() etc; just make sure you do the same call on > each process that shares the object. > >> > >> To change the viewer format you do need to use > PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), > PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that > live on PETSC_COMM_WORLD. > >> > >> > >> Barry > >> > >> PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the > format of the sequential ASCII viewer. > >> > >>> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: > >>> > >>> > >>> I do this by running in the debugger and putting in breakpoints. At > the breakpoint you can look directly at variables like the n in call to > VecMDot() you can also call KSPView() etc on any PETSc object (with a > viewer of 0) and it will print out the information about the object right > then. Calling VecView() or MatView() directly will of course cause it to > print the entire object which generally you don't want but you can do > PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or > MatView to have it print size information etc about the object instead of > the full object. In parallel instead of passing 0 for the viewer you need > to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes > that share the object call the routine in the debugger but it is possible. > >>> > >>> Let us know how it goes and we can try to improve the experience with > your suggestions, > >>> > >>> Barry > >>> > >>>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: > >>>> > >>>> Hi all, > >>>> > >>>> Not sure what to title this mail, but let me begin with an analogy of > what I am looking for: > >>>> > >>>> In MATLAB, we could insert breakpoints into the code, such that when > we run the program, we could pause the execution and see what the variables > contain and what is going on exactly within your function calls. Is there a > way to do something like this within PETSc? 
> >>>> > >>>> I want to see what's going on within certain PETSc functions within > KSPSolve. For instance, -log_summary says that my solver invokes calls to > functions like VecMDot and VecMAXPY but I would like to know exactly how > many vectors each of these functions are working with. Morever, I would > also like to get a general overview of the properties of the matrices > MatPtAP and MatMatMult are playing with (e.g., dimensions, number of > nonzeros, etc). > >>>> > >>>> Or > >>>> > >>>> Above functions happen to be invoked from gamg, so is it possible to > tell just from the parameters fed into PETSc what the answers to the above > may be? > >>>> > >>>> Thanks, > >>>> Justin > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 20:53:43 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 10:53:43 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Thank you very much for your help and suggestions. With your help, finally I could continue my project. Regards Cong Li On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > Note that since your B and C matrices are dense the issue of sparsity > pattern of C is not relevant. > > Barry > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > Thanks very much. This answer is very helpful. > > And I have a following question. > > If I create B1, B2, .. by the way you suggested and then use MatMatMult > to do SPMM. > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat > *C) > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > I am sorry that I should have explained it more clearly. > > > Actually I want to compute a recurrence. > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > Barry > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. 
> > > > I want to do SPMM several times and combine result matrices into one > bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Tue Aug 4 22:33:28 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 05 Aug 2015 11:33:28 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 Message-ID: <55C18408.5040500@gmail.com> Hi there, I tried to configure the petsc-3.6.1 on my laptop but failed with the following error. The configure.log is attached. Any suggestions? Thanks. configure: error: Can't find or link to the hdf5 library. Use --disable-netcdf-4, or see config.log for errors. Best, Rongliang -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 2927981 bytes Desc: not available URL: From jed at jedbrown.org Tue Aug 4 23:26:35 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 22:26:35 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <55C18408.5040500@gmail.com> References: <55C18408.5040500@gmail.com> Message-ID: <87r3niz7gk.fsf@jedbrown.org> Rongliang Chen writes: > Hi there, > > I tried to configure the petsc-3.6.1 on my laptop but failed with the > following error. The configure.log is attached. Any suggestions? Thanks. > > configure: error: Can't find or link to the hdf5 library. Use > --disable-netcdf-4, or see config.log for errors. Looks like you'll have to check NetCDF's config.log for the details. Either something is wrong with the HDF5 install or the wrong options are being passed to NetCDF configure. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gbisht at lbl.gov Tue Aug 4 23:27:24 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Tue, 4 Aug 2015 21:27:24 -0700 Subject: [petsc-users] Error running DMPlex example Message-ID: Hi, I'm getting the following error while running the following DMPlex example. Any suggestion what is going wrong? Attached are example.log and configure.log. 
python2.7 ./config/builder2.py check src/snes/examples/tutorials/ex12.c Namespace(args=[], files=['src/snes/examples/tutorials/ex12.c'], func=, numProcs=None, regParams=None, replace=False, retain=False, testnum=None) Running 52 tests Building ['/Users/gbisht/projects/petsc/petsc_f0284fa/src/snes/examples/tutorials/ex12.c'] Running #0: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -s TEST ERROR: Failed to execute darwin-gnu-fort-debug/lib/ex12-obj/ex12 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = PID 69451 RUNNING AT localhost = EXIT CODE: 59 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] DMPlexGenerate_Triangle line 217 src/dm/impls/plex/plexgenerate.c [0]PETSC ERROR: [0] DMPlexGenerate line 1056 src/dm/impls/plex/plexgenerate.c [0]PETSC ERROR: [0] DMPlexCreateBoxMesh line 897 src/dm/impls/plex/plexcreate.c [0]PETSC ERROR: [0] CreateMesh line 347 src/snes/examples/tutorials/ex12.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.6.1-238-gf0284fa GIT Date: 2015-07-27 13:34:26 -0500 [0]PETSC ERROR: darwin-gnu-fort-debug/lib/ex12-obj/ex12 on a darwin-gnu-fort-debug named gautam-laptop by gbisht Tue Aug 4 21:09:37 2015 [0]PETSC ERROR: Configure options --download-hdf5=1 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --download-parmetis=yes --download-metis=yes --with-c [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 TEST RUN FAILED (check example.log for details) -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 4688020 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: example.log Type: application/octet-stream Size: 7189 bytes Desc: not available URL: From rongliang.chan at gmail.com Wed Aug 5 01:05:55 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 05 Aug 2015 14:05:55 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87r3niz7gk.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> Message-ID: <55C1A7C3.7030209@gmail.com> Hi Jed, Thanks for your reply. I checked the netcdf and hdf5's config.log and could not find any possible solutions. Can you help me check these two files again? The two files are attached. Thanks. Best regards, Rongliang On 08/05/2015 12:26 PM, Jed Brown wrote: > Rongliang Chen writes: > >> Hi there, >> >> I tried to configure the petsc-3.6.1 on my laptop but failed with the >> following error. The configure.log is attached. Any suggestions? Thanks. >> >> configure: error: Can't find or link to the hdf5 library. Use >> --disable-netcdf-4, or see config.log for errors. > Looks like you'll have to check NetCDF's config.log for the details. > Either something is wrong with the HDF5 install or the wrong options are > being passed to NetCDF configure. -------------- next part -------------- A non-text attachment was scrubbed... Name: config-hdf5.log Type: text/x-log Size: 1804340 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: config-netcdf.log Type: text/x-log Size: 127324 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 01:23:14 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 15:23:14 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values Message-ID: Hi, I am wondering if it is necessary to call MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the option of MAT_DO_NOT_COPY_VALUES. For example, if I have an assembled matrix A, and I call MatDuplicate() to create B, which is a duplication of A. Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. And 2nd question is : just after the MatCreateDense() call and before MatAssemblyBegin() and MatAssemblyEnd() calls, can I use MatGetArray() ? The 3rd question is: before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? Actually I have read the manual, but I still feel confused about the means of INSERT_VALUES and ADD_VALUES. Thanks Cong Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 5 03:37:41 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 5 Aug 2015 10:37:41 +0200 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <2182692.n8DuiFgnrM@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> <2182692.n8DuiFgnrM@tinlaptop> Message-ID: > > OK, I was not aware of PCShell (I'm new to PETSc). I don't know Trilinos > well > enough to judge whether it's good from software engineering point of view > or > not, but allow me one last question. What would happen if I wrap all > 'other' > solvers in PCShell and then for some reason, PETSc is not available. This is a fair comment. However, in my experience PETSc builds everywhere. If PETSc isn't provided as a module on the resource you have access to, it is relatively straight forward to built the entire library yourself. 
The --download-XXX feature of PETSc's configure is pretty damn good and also will on most (if not all) machines. If configure does fail on your machine of choice, send the configure.log file to petsc-maint at mcs.anl.gov. The PETSc guys will sort out the problem. In 12 years, I haven't found a single machine which I couldn't get petsc compiled on. I think you are safe if you wrap everything within PETSc. :D Cheers Dave > None of > the other solvers would be accessible (unless I modify the source code), so > wrapping everything using PCShell creates a strong dependency on one > particular library (PETSc), doesn't it? > > Martin > > > > > > > Best regards, > > > > > > Martin Vymazal > > > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > > > > > > martin.vymazal at vki.ac.be> > > > > > > > wrote: > > > > > Hello, > > > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. > This > > > > > > > > > > class > > > > > has an internal pointer to a member variable of type Vec, and in > its > > > > > destructor, it calls VecDestroy. Unfortunately, my test program > > > > > > segfaults > > > > > > > > and > > > > > this seems to be due to the fact that the destructor of the wrapper > > > > > > class > > > > > > > > is > > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > > > performs > > > > > > > > some > > > > > collective communication, so calling it after PetscFinalize() is > too > > > > > > late. > > > > > > > > How > > > > > can I fix this? > > > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > > PetscFinalize() > > > > > > > > 2) Is there any utility to this wrapper since everything can be > called > > > > directly from C++? > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > > > Thank you, > > > > > > > > > > Martin Vymazal > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 5 04:15:16 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 5 Aug 2015 11:15:16 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> Message-ID: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Hello, I'm trying to solve a system with a matrix free operator and through conjugate gradient method. To make ideas clear, I set up the following simple example (I am using petsc-3.6) and I get this error message : " [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed Aug 5 10:55:26 2015 [0]PETSC ERROR: Libraries linked from /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi --download-f-blas-lapack --download-superlu --download-superlu_dist --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatShellGetContext() line 202 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c End userMult [0]PETSC ERROR: MatMult() line 2179 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c [0]PETSC ERROR: KSP_MatMult() line 204 in /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h [0]PETSC ERROR: KSPSolve_CG() line 219 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c [0]PETSC ERROR: KSPSolve() line 441 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c " I don't understand where the problem comes from with the matrix argument of MatShellGetContext. Any idea on what I do wrong? Thanks a lot, Nicolas #include #include using namespace std; typedef struct { int val; } MyCtx; class ShellClass { Mat matShell; KSP ksp; PC pc; Vec x; Vec b; public: void userMult(Mat Amat, Vec x, Vec y) { cout << "Inside userMult" << endl; MyCtx *ctx; MatShellGetContext(Amat, (void *) ctx); cout << "End userMult" << endl; } void solveShell() { // context MyCtx *ctx = new MyCtx; ctx->val = 42; // pc PCCreate(PETSC_COMM_WORLD, &pc); PCSetType(pc, PCNONE); // ksp KSPCreate(PETSC_COMM_WORLD, &ksp); KSPSetType(ksp, KSPCG); KSPSetPC(ksp, pc); KSPSetFromOptions(ksp); // matshell int m = 10; int n = 10; MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE, ctx, &matShell); MatShellSetOperation(matShell, MATOP_MULT, (void(*)(void))&ShellClass::userMult); // create vectors MatCreateVecs(matShell, &x, 0); VecDuplicate(x, &b); VecSet(b, 1.); // set operators KSPSetOperators(ksp, matShell, matShell); // solve (call to userMult) KSPSolve(ksp, b, x); } }; int main(int argc, char** argv) { PetscInitialize(&argc, &argv, NULL, NULL); ShellClass foo; foo.solveShell(); PetscFinalize(); return 0; } -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: text/x-makefile Size: 171 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.cpp Type: text/x-c++src Size: 1372 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 04:42:16 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 18:42:16 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hi I tried the method you suggested. 
However, I got the error message. My code and message are below. K is the big matrix containing column matrices. code: call MatGetArray(K,KArray,KArrayOffset,ierr) call MatGetLocalSize(R,local_RRow,local_RCol) call MatGetArray(R,RArray,RArrayOffset,ierr) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) localRsize = local_RRow * local_RCol do genIdx= 1, localRsize KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) end do call MatRestoreArray(R,RArray,RArrayOffset,ierr) call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) end do call MatRestoreArray(K,KArray,KArrayOffset,ierr) do stepIdx= 2, step_k call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) end do And I got the error message as below: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range ---------------------------------------------------- [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- [mpi::mpi-api::mpi-abort] MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] [p01-024:26516] ./kmath.bcbcg [0x1bf620] [p01-024:26516] ./kmath.bcbcg [0x1bf20c] [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] [p01-024:26516] [(nil)] [p01-024:26516] ./kmath.bcbcg [0x1a2054] [p01-024:26516] ./kmath.bcbcg [0x1064f8] [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] [p01-024:26516] ./kmath.bcbcg [0x1051ec] [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) However, if I change from call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) to call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) everything is fine. could you please suggest some way to solve this? Thanks Cong Li On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > Thank you very much for your help and suggestions. > With your help, finally I could continue my project. > > Regards > > Cong Li > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > >> >> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >> created. >> >> Since you want to use the C that is passed in you should use >> MAT_REUSE_MATRIX. >> >> Note that since your B and C matrices are dense the issue of sparsity >> pattern of C is not relevant. >> >> Barry >> >> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >> > >> > Thanks very much. This answer is very helpful. >> > And I have a following question. >> > If I create B1, B2, .. by the way you suggested and then use MatMatMult >> to do SPMM. >> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >> fill,Mat *C) >> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >> > >> > Thanks >> > >> > Cong Li >> > >> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: >> > >> > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: >> > > >> > > I am sorry that I should have explained it more clearly. >> > > Actually I want to compute a recurrence. 
>> > > >> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> A*B2=B3 and so on. >> > > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 ...] >> > >> > First create C with MatCreateDense(,&C). Then call >> MatDenseGetArray(C,&array); then create B1 with >> MatCreateDense(....,array,&B1); then create >> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the >> number of __local__ rows in B1 times the number of columns in B1, then >> create B3 with a larger shift etc. >> > >> > Note that you are "sharing" the array space of C with B1, B2, B3, >> ..., each Bi contains its columns of the C matrix. >> > >> > Barry >> > >> > >> > >> > > >> > > Is there any way to do this efficiently. >> > > >> > > >> > > >> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >> patrick.sanan at gmail.com> wrote: >> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > Thanks for your reply. >> > > > >> > > > I have an other question. >> > > > I want to do SPMM several times and combine result matrices into >> one bigger >> > > > matrix. >> > > > for example >> > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > >> > > > Could you please suggest a way of how to do this. >> > > This is just linear algebra, nothing to do with PETSc specifically. >> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > >> > > > Thanks >> > > > >> > > > Cong Li >> > > > >> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > > > >> > > > > Cong Li writes: >> > > > > >> > > > > > Hello, >> > > > > > >> > > > > > I am a PhD student using PETsc for my research. >> > > > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > > > multiplication) by using PETSc. >> > > > > >> > > > > >> > > > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > >> > > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 04:47:53 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 18:47:53 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: References: Message-ID: Thanks, Patrick. I think I got it now. Cong Li On Wed, Aug 5, 2015 at 3:45 PM, Patrick Sanan wrote: > > > > > Am 05.08.2015 um 08:23 schrieb Cong Li : > > Hi, > > I am wondering if it is necessary to call > MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the > option of MAT_DO_NOT_COPY_VALUES. > For example, if I have an assembled matrix A, and I call MatDuplicate() to > create B, which is a duplication of A. > Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. > > And 2nd question is : > just after the MatCreateDense() call and before MatAssemblyBegin() > and MatAssemblyEnd() calls, can I use MatGetArray() ? > > The 3rd question is: > before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use > INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? > Actually I have read the manual, but I still feel confused about the means > of INSERT_VALUES and ADD_VALUES. 
> > There are a couple of reasons that you need to make these > MatAssemblyBegin/End calls: > - entries can be set which should be stored on a different process, so > these need to be communicated > - for compressed formats like CSR (as used in MATAIJ and others) the > entries need to be processed into their compressed form > In general, the entries of the matrix are not stored in their "usable" > forms until you make the MatAssembleEnd call. Rather they are kept in some > easy-to-insert-into intermediate storage. INSERT_VALUES means that old > values that might be in the matrix are overwritten , and ADD_VALUES means > that the new entries from intermediate storage are added to the old values. > > > > Thanks > > Cong Li > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 06:38:20 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 06:38:20 -0500 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 5, 2015 at 4:15 AM, Nicolas Pozin wrote: > Hello, > > I'm trying to solve a system with a matrix free operator and through > conjugate gradient method. > To make ideas clear, I set up the following simple example (I am using > petsc-3.6) and I get this error message : > Yes, you are passing a C++ function userMult, so the compiler sticks "this" in as the first argument. We do not recommend this kind of wrapping. Thanks, Matt > " > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed > Aug 5 10:55:26 2015 > [0]PETSC ERROR: Libraries linked from > /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib > [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 > [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ > --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi > --download-f-blas-lapack --download-superlu --download-superlu_dist > --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis > --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack > with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatShellGetContext() line 202 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c > End userMult > [0]PETSC ERROR: MatMult() line 2179 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c > [0]PETSC ERROR: KSP_MatMult() line 204 in > /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h > [0]PETSC ERROR: KSPSolve_CG() line 219 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c > [0]PETSC ERROR: KSPSolve() line 441 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c > " > > I don't understand where the problem comes from with the matrix argument > of MatShellGetContext. > Any idea on what I do wrong? > > Thanks a lot, > Nicolas > > > > #include > #include > > using namespace std; > > > typedef struct { > int val; > } MyCtx; > > > class ShellClass { > Mat matShell; > KSP ksp; > PC pc; > Vec x; > Vec b; > > public: > void userMult(Mat Amat, Vec x, Vec y) { > cout << "Inside userMult" << endl; > > MyCtx *ctx; > MatShellGetContext(Amat, (void *) ctx); > > cout << "End userMult" << endl; > } > > void solveShell() { > // context > MyCtx *ctx = new MyCtx; > ctx->val = 42; > > // pc > PCCreate(PETSC_COMM_WORLD, &pc); > PCSetType(pc, PCNONE); > > // ksp > KSPCreate(PETSC_COMM_WORLD, &ksp); > KSPSetType(ksp, KSPCG); > KSPSetPC(ksp, pc); > KSPSetFromOptions(ksp); > > // matshell > int m = 10; > int n = 10; > MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, > PETSC_DETERMINE, ctx, &matShell); > MatShellSetOperation(matShell, MATOP_MULT, > (void(*)(void))&ShellClass::userMult); > > > // create vectors > MatCreateVecs(matShell, &x, 0); > VecDuplicate(x, &b); > VecSet(b, 1.); > > // set operators > KSPSetOperators(ksp, matShell, matShell); > > // solve (call to userMult) > KSPSolve(ksp, b, x); > } > }; > > > > int main(int argc, char** argv) { > PetscInitialize(&argc, &argv, NULL, NULL); > > ShellClass foo; > foo.solveShell(); > > PetscFinalize(); > return 0; > } > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 06:45:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 06:45:02 -0500 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: On Tue, Aug 4, 2015 at 11:27 PM, Gautam Bisht wrote: > Hi, > > I'm getting the following error while running the following DMPlex > example. 
Any suggestion what is going wrong? Attached are example.log and > configure.log. > Can you run the test by hand either with the debugger and get a stack trace: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -show_initial -dm_plex_print_fem 1 -start_in_debugger or using valgrind? I cannot reproduce the problem here. Thanks, Matt > python2.7 ./config/builder2.py check src/snes/examples/tutorials/ex12.c > Namespace(args=[], files=['src/snes/examples/tutorials/ex12.c'], > func=, numProcs=None, regParams=None, > replace=False, retain=False, testnum=None) > Running 52 tests > Building > ['/Users/gbisht/projects/petsc/petsc_f0284fa/src/snes/examples/tutorials/ex12.c'] > Running #0: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 > darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit > 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -s > TEST ERROR: Failed to execute darwin-gnu-fort-debug/lib/ex12-obj/ex12 > =================================================================================== > > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > > = PID 69451 RUNNING AT localhost > > > = EXIT CODE: 59 > > > = CLEANING UP REMAINING PROCESSES > > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > > > =================================================================================== > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] DMPlexGenerate_Triangle line 217 > src/dm/impls/plex/plexgenerate.c > > [0]PETSC ERROR: [0] DMPlexGenerate line 1056 > src/dm/impls/plex/plexgenerate.c > > > [0]PETSC ERROR: [0] DMPlexCreateBoxMesh line 897 > src/dm/impls/plex/plexcreate.c > > > [0]PETSC ERROR: [0] CreateMesh line 347 src/snes/examples/tutorials/ex12.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Development GIT revision: v3.6.1-238-gf0284fa GIT > Date: 2015-07-27 13:34:26 -0500 > > [0]PETSC ERROR: darwin-gnu-fort-debug/lib/ex12-obj/ex12 on a > darwin-gnu-fort-debug named gautam-laptop by gbisht Tue Aug 4 21:09:37 > 2015 > [0]PETSC ERROR: Configure options --download-hdf5=1 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --download-parmetis=yes --download-metis=yes --with-c > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > > > > TEST RUN FAILED (check example.log for details) > > > -Gautam. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 09:23:04 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 09:23:04 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong: You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. The correct process is call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ DEFAULT_INTEGER,C, ierr) call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) i.e., C has data structure of A*Km(stepIdx-1) and is created in the first call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed values, but not the structures. In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' directly. Hong On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: > Hi > > I tried the method you suggested. However, I got the error message. > My code and message are below. > > K is the big matrix containing column matrices. 
> > code: > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > do stepIdx= 2, step_k > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > > And I got the error message as below: > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 > CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > ---------------------------------------------------- > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > [mpi::mpi-api::mpi-abort] > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > [p01-024:26516] [(nil)] > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the > batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 > CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > to > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX > ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > everything is fine. > > could you please suggest some way to solve this? > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > >> Thank you very much for your help and suggestions. >> With your help, finally I could continue my project. >> >> Regards >> >> Cong Li >> >> >> >> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >> >>> >>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>> created. >>> >>> Since you want to use the C that is passed in you should use >>> MAT_REUSE_MATRIX. >>> >>> Note that since your B and C matrices are dense the issue of sparsity >>> pattern of C is not relevant. >>> >>> Barry >>> >>> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >>> > >>> > Thanks very much. This answer is very helpful. >>> > And I have a following question. >>> > If I create B1, B2, .. by the way you suggested and then use >>> MatMatMult to do SPMM. >>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>> fill,Mat *C) >>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>> wrote: >>> > >>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>> wrote: >>> > > >>> > > I am sorry that I should have explained it more clearly. >>> > > Actually I want to compute a recurrence. >>> > > >>> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>> A*B2=B3 and so on. >>> > > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 ...] >>> > >>> > First create C with MatCreateDense(,&C). 
Then call >>> MatDenseGetArray(C,&array); then create B1 with >>> MatCreateDense(....,array,&B1); then create >>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the >>> number of __local__ rows in B1 times the number of columns in B1, then >>> create B3 with a larger shift etc. >>> > >>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>> ..., each Bi contains its columns of the C matrix. >>> > >>> > Barry >>> > >>> > >>> > >>> > > >>> > > Is there any way to do this efficiently. >>> > > >>> > > >>> > > >>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>> patrick.sanan at gmail.com> wrote: >>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > Thanks for your reply. >>> > > > >>> > > > I have an other question. >>> > > > I want to do SPMM several times and combine result matrices into >>> one bigger >>> > > > matrix. >>> > > > for example >>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > >>> > > > Could you please suggest a way of how to do this. >>> > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > >>> > > > Thanks >>> > > > >>> > > > Cong Li >>> > > > >>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > >>> > > > > Cong Li writes: >>> > > > > >>> > > > > > Hello, >>> > > > > > >>> > > > > > I am a PhD student using PETsc for my research. >>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > > > multiplication) by using PETSc. >>> > > > > >>> > > > > >>> > > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > >>> > > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 09:43:51 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 23:43:51 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, Thanks for your answer. However, in my problem, I have a pre-allocated matrix K, and its columns are associated with Km(1), .. Km(step_k) respectively. What I want to do is to update Km(2) by using the result of A*Km(1), and then to update Km(3) by using the product of A and updated Km(2) and so on. So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even when it is the first time I call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ DEFAULT_INTEGER,Km(stepIdx), ierr)', Km(stepIdx) have actually already been allocated (in K). Do you think it is possible that I can do this, and could you please suggest some possible ways. Thanks Cong Li On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: > Cong: > You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. > The correct process is > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ > DEFAULT_INTEGER,C, ierr) > call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ > DEFAULT_INTEGER,C, ierr) > i.e., C has data structure of A*Km(stepIdx-1) and is created in the first > call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed > values, but not the structures. > > In your case, Km(stepIdx) = A*Km(stepIdx-1). 
You should do > 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX > ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' > directly. > > Hong > > On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: > >> Hi >> >> I tried the method you suggested. However, I got the error message. >> My code and message are below. >> >> K is the big matrix containing column matrices. >> >> code: >> >> call MatGetArray(K,KArray,KArrayOffset,ierr) >> >> call MatGetLocalSize(R,local_RRow,local_RCol) >> >> call MatGetArray(R,RArray,RArrayOffset,ierr) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> >> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >> >> localRsize = local_RRow * local_RCol >> do genIdx= 1, localRsize >> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> end do >> >> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> >> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> >> do stepIdx= 2, step_k >> >> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> >> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> end do >> >> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> >> do stepIdx= 2, step_k >> >> >> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> end do >> >> >> And I got the error message as below: >> >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 >> CDT 2013 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> ---------------------------------------------------- >> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >> Aug 5 18:24:40 2015 >> [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> -------------------------------------------------------------------------- >> [mpi::mpi-api::mpi-abort] >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >> [0xffffffff0091f684] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >> [0xffffffff006c389c] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >> [0xffffffff006db3ac] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >> [0xffffffff00281bf0] >> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >> [p01-024:26516] [(nil)] >> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >> [0xffffffff02d3b81c] >> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the >> batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 >> CDT 2013 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >> Aug 5 18:24:40 2015 >> [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> [ERR.] PLE 0019 plexec One of MPI processes was >> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >> >> However, if I change from >> >> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> to >> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> >> everything is fine. >> >> could you please suggest some way to solve this? >> >> Thanks >> >> Cong Li >> >> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >> wrote: >> >>> Thank you very much for your help and suggestions. >>> With your help, finally I could continue my project. >>> >>> Regards >>> >>> Cong Li >>> >>> >>> >>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>> >>>> >>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>> created. >>>> >>>> Since you want to use the C that is passed in you should use >>>> MAT_REUSE_MATRIX. >>>> >>>> Note that since your B and C matrices are dense the issue of sparsity >>>> pattern of C is not relevant. >>>> >>>> Barry >>>> >>>> > On Aug 4, 2015, at 11:59 AM, Cong Li >>>> wrote: >>>> > >>>> > Thanks very much. This answer is very helpful. >>>> > And I have a following question. >>>> > If I create B1, B2, .. by the way you suggested and then use >>>> MatMatMult to do SPMM. >>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>> fill,Mat *C) >>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>> > >>>> > Thanks >>>> > >>>> > Cong Li >>>> > >>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>> wrote: >>>> > >>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>> wrote: >>>> > > >>>> > > I am sorry that I should have explained it more clearly. >>>> > > Actually I want to compute a recurrence. >>>> > > >>>> > > Like, I want to firstly compute A*X1=B1, and then calculate >>>> A*B1=B2, A*B2=B3 and so on. >>>> > > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 ...] >>>> > >>>> > First create C with MatCreateDense(,&C). 
Then call >>>> MatDenseGetArray(C,&array); then create B1 with >>>> MatCreateDense(....,array,&B1); then create >>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>> create B3 with a larger shift etc. >>>> > >>>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>>> ..., each Bi contains its columns of the C matrix. >>>> > >>>> > Barry >>>> > >>>> > >>>> > >>>> > > >>>> > > Is there any way to do this efficiently. >>>> > > >>>> > > >>>> > > >>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> wrote: >>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > Thanks for your reply. >>>> > > > >>>> > > > I have an other question. >>>> > > > I want to do SPMM several times and combine result matrices into >>>> one bigger >>>> > > > matrix. >>>> > > > for example >>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > >>>> > > > Could you please suggest a way of how to do this. >>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > >>>> > > > Thanks >>>> > > > >>>> > > > Cong Li >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > >>>> > > > > Cong Li writes: >>>> > > > > >>>> > > > > > Hello, >>>> > > > > > >>>> > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> matrix-matrix >>>> > > > > > multiplication) by using PETSc. >>>> > > > > >>>> > > > > >>>> > > > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > >>>> > > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Wed Aug 5 09:46:24 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Wed, 5 Aug 2015 14:46:24 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. 
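For reference, the PETSc side of such a run is small. The following is a minimal sketch, not the actual ./solve driver (whose source is not shown in the thread); the routine name solve_direct and the assumption that A, b and x already exist are illustrative only. It shows how a direct solve is typically wired to SuperLU_DIST so that the runtime options used above (-ksp_type, -pc_type, -pc_factor_mat_solver_package, -mat_superlu_dist_*) reach the factorization, and it creates the KSP on the communicator of the matrix.

#include <petscksp.h>

/* Minimal sketch: direct solve of A x = b through SuperLU_DIST.
   A, b and x are assumed to be created elsewhere; the KSP lives on
   whatever communicator A was created on. */
PetscErrorCode solve_direct(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  /* Pick up -ksp_view, -mat_superlu_dist_matinput, -mat_superlu_dist_parsymbfact, ... */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  /* The LU factorization (and any SuperLU_DIST failure) happens inside KSPSolve. */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The "N MPI processes" and "Process grid nprow x npcol" lines printed by -ksp_view reflect the communicator on which the matrix and KSP were created, so a report of "1 MPI processes" with "type: seqaij" under mpiexec -n 2 usually indicates that each rank has built its own sequential problem rather than one distributed system.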
If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 10:10:58 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 10:10:58 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir: As you noticed, you ran the code in serial mode, not parallel. Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? > > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 19:06 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... 
> > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. 
Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. 
Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 5 10:20:41 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 5 Aug 2015 17:20:41 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: <417177684.6699042.1438788041611.JavaMail.zimbra@inria.fr> Thank you! ----- Mail original ----- > De: "Matthew Knepley" > ?: "Nicolas Pozin" > Cc: "PETSc" > Envoy?: Mercredi 5 Ao?t 2015 13:38:20 > Objet: Re: [petsc-users] problem with MatShellGetContext > On Wed, Aug 5, 2015 at 4:15 AM, Nicolas Pozin < nicolas.pozin at inria.fr > > wrote: > > Hello, > > > I'm trying to solve a system with a matrix free operator and through > > conjugate gradient method. 
> > > To make ideas clear, I set up the following simple example (I am using > > petsc-3.6) and I get this error message : > > Yes, you are passing a C++ function userMult, so the compiler sticks "this" > in as the first argument. We do not > recommend this kind of wrapping. > Thanks, > Matt > > " > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > > [0]PETSC ERROR: Invalid argument! > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed Aug > > 5 > > 10:55:26 2015 > > > [0]PETSC ERROR: Libraries linked from > > /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib > > > [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ > > --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi > > --download-f-blas-lapack --download-superlu --download-superlu_dist > > --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis > > --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack > > with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: MatShellGetContext() line 202 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c > > > End userMult > > > [0]PETSC ERROR: MatMult() line 2179 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c > > > [0]PETSC ERROR: KSP_MatMult() line 204 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h > > > [0]PETSC ERROR: KSPSolve_CG() line 219 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c > > > [0]PETSC ERROR: KSPSolve() line 441 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c > > > " > > > I don't understand where the problem comes from with the matrix argument of > > MatShellGetContext. > > > Any idea on what I do wrong? 
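For reference, a minimal sketch of the pattern Matt is recommending instead of the C++ wrapping: the MATOP_MULT callback must be a plain function (file scope, or a static member of the class), and the context is retrieved by passing the address of the context pointer. This is an illustration only, not code from this thread:

   #include <petscmat.h>

   typedef struct { int val; } MyCtx;

   /* A file-scope function (a static member of ShellClass would work the same way);
      a non-static member cannot be used because of the implicit 'this' argument. */
   static PetscErrorCode UserMult(Mat Amat, Vec x, Vec y)
   {
     MyCtx          *ctx;
     PetscErrorCode ierr;
     ierr = MatShellGetContext(Amat, (void**)&ctx); CHKERRQ(ierr); /* note: address of ctx */
     /* ... apply the operator to x, store the result in y, using ctx ... */
     return 0;
   }

   /* registration stays the same:
      MatShellSetOperation(matShell, MATOP_MULT, (void(*)(void))UserMult); */
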
> > > Thanks a lot, > > > Nicolas > > > #include > > > #include > > > using namespace std; > > > typedef struct { > > > int val; > > > } MyCtx; > > > class ShellClass { > > > Mat matShell; > > > KSP ksp; > > > PC pc; > > > Vec x; > > > Vec b; > > > public: > > > void userMult(Mat Amat, Vec x, Vec y) { > > > cout << "Inside userMult" << endl; > > > MyCtx *ctx; > > > MatShellGetContext(Amat, (void *) ctx); > > > cout << "End userMult" << endl; > > > } > > > void solveShell() { > > > // context > > > MyCtx *ctx = new MyCtx; > > > ctx->val = 42; > > > // pc > > > PCCreate(PETSC_COMM_WORLD, &pc); > > > PCSetType(pc, PCNONE); > > > // ksp > > > KSPCreate(PETSC_COMM_WORLD, &ksp); > > > KSPSetType(ksp, KSPCG); > > > KSPSetPC(ksp, pc); > > > KSPSetFromOptions(ksp); > > > // matshell > > > int m = 10; > > > int n = 10; > > > MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE, > > ctx, > > &matShell); > > > MatShellSetOperation(matShell, MATOP_MULT, > > (void(*)(void))&ShellClass::userMult); > > > // create vectors > > > MatCreateVecs(matShell, &x, 0); > > > VecDuplicate(x, &b); > > > VecSet(b, 1.); > > > // set operators > > > KSPSetOperators(ksp, matShell, matShell); > > > // solve (call to userMult) > > > KSPSolve(ksp, b, x); > > > } > > > }; > > > int main(int argc, char** argv) { > > > PetscInitialize(&argc, &argv, NULL, NULL); > > > ShellClass foo; > > > foo.solveShell(); > > > PetscFinalize(); > > > return 0; > > > } > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 10:28:34 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 10:28:34 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong, For the first loop: do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) ... end do Do you use Km(stepIdx) here? If not, replace MatCreateDense() with MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,...). Is matrix A dense or sparse? Hong On Wed, Aug 5, 2015 at 9:43 AM, Cong Li wrote: > Hong, > > Thanks for your answer. > However, in my problem, I have a pre-allocated matrix K, and its columns > are associated with Km(1), .. Km(step_k) respectively. What I want to do is > to update Km(2) by using the result of A*Km(1), and then to update Km(3) by > using the product of A and updated Km(2) and so on. > > So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even > when it is the first time I call > MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ > DEFAULT_INTEGER,Km(stepIdx), ierr)', > > Km(stepIdx) have actually already been allocated (in K). > > Do you think it is possible that I can do this, and could you please > suggest some possible ways. > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: > >> Cong: >> You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. 
>> The correct process is >> >> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ >> DEFAULT_INTEGER,C, ierr) >> call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ >> DEFAULT_INTEGER,C, ierr) >> i.e., C has data structure of A*Km(stepIdx-1) and is created in the >> first call. C can be reused in the 2nd call when A or Km(stepIdx-1) >> changed values, but not the structures. >> >> In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do >> 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' >> directly. >> >> Hong >> >> On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: >> >>> Hi >>> >>> I tried the method you suggested. However, I got the error message. >>> My code and message are below. >>> >>> K is the big matrix containing column matrices. >>> >>> code: >>> >>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>> >>> call MatGetLocalSize(R,local_RRow,local_RCol) >>> >>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> >>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>> >>> localRsize = local_RRow * local_RCol >>> do genIdx= 1, localRsize >>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> end do >>> >>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> >>> do stepIdx= 2, step_k >>> >>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> >>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> end do >>> >>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> >>> do stepIdx= 2, step_k >>> >>> >>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> end do >>> >>> >>> And I got the error message as below: >>> >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> ---------------------------------------------------- >>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >>> Aug 5 18:24:40 2015 >>> [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> >>> -------------------------------------------------------------------------- >>> [mpi::mpi-api::mpi-abort] >>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>> with errorcode 59. >>> >>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> You may or may not see output from other processes, depending on >>> exactly when Open MPI kills them. 
>>> >>> -------------------------------------------------------------------------- >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>> [0xffffffff0091f684] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>> [0xffffffff006c389c] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>> [0xffffffff006db3ac] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>> [0xffffffff00281bf0] >>> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>> [p01-024:26516] [(nil)] >>> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>> [0xffffffff02d3b81c] >>> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the >>> batch system) has told this process to end >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >>> Aug 5 18:24:40 2015 >>> [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> [ERR.] PLE 0019 plexec One of MPI processes was >>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>> >>> However, if I change from >>> >>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> to >>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >>> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> >>> everything is fine. >>> >>> could you please suggest some way to solve this? >>> >>> Thanks >>> >>> Cong Li >>> >>> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>> wrote: >>> >>>> Thank you very much for your help and suggestions. >>>> With your help, finally I could continue my project. >>>> >>>> Regards >>>> >>>> Cong Li >>>> >>>> >>>> >>>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>>> >>>>> >>>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>>> created. >>>>> >>>>> Since you want to use the C that is passed in you should use >>>>> MAT_REUSE_MATRIX. >>>>> >>>>> Note that since your B and C matrices are dense the issue of >>>>> sparsity pattern of C is not relevant. >>>>> >>>>> Barry >>>>> >>>>> > On Aug 4, 2015, at 11:59 AM, Cong Li >>>>> wrote: >>>>> > >>>>> > Thanks very much. This answer is very helpful. >>>>> > And I have a following question. >>>>> > If I create B1, B2, .. by the way you suggested and then use >>>>> MatMatMult to do SPMM. >>>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>>> fill,Mat *C) >>>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>> > >>>>> > Thanks >>>>> > >>>>> > Cong Li >>>>> > >>>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>>> wrote: >>>>> > >>>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>>> wrote: >>>>> > > >>>>> > > I am sorry that I should have explained it more clearly. >>>>> > > Actually I want to compute a recurrence. >>>>> > > >>>>> > > Like, I want to firstly compute A*X1=B1, and then calculate >>>>> A*B1=B2, A*B2=B3 and so on. >>>>> > > Finally I want to combine all these results into a bigger matrix >>>>> C=[B1,B2 ...] 
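For reference, a compact sketch (in C; sizes and names are illustrative, error checking omitted) of the shared-storage construction described in the reply just below, where C owns the storage and B1, B2, ... are dense matrices laid over consecutive column blocks of C's local array:

   Mat         Cmat, B1, B2;
   PetscScalar *array;
   PetscInt    M = 100, ncols = 4, mloc;

   MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, M, 3*ncols, NULL, &Cmat);
   MatGetLocalSize(Cmat, &mloc, NULL);
   MatDenseGetArray(Cmat, &array);
   MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, M, ncols, array,              &B1);
   MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, M, ncols, array + mloc*ncols, &B2);
   /* B3, B4, ... follow with shifts 2*mloc*ncols, 3*mloc*ncols, ...; each Bi then views
      its own columns of Cmat, so filling a Bi fills the corresponding block of Cmat.
      The array should eventually be returned with MatDenseRestoreArray(Cmat,&array). */
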
>>>>> > >>>>> > First create C with MatCreateDense(,&C). Then call >>>>> MatDenseGetArray(C,&array); then create B1 with >>>>> MatCreateDense(....,array,&B1); then create >>>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>>> create B3 with a larger shift etc. >>>>> > >>>>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>>>> ..., each Bi contains its columns of the C matrix. >>>>> > >>>>> > Barry >>>>> > >>>>> > >>>>> > >>>>> > > >>>>> > > Is there any way to do this efficiently. >>>>> > > >>>>> > > >>>>> > > >>>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>>> patrick.sanan at gmail.com> wrote: >>>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>> > > > Thanks for your reply. >>>>> > > > >>>>> > > > I have an other question. >>>>> > > > I want to do SPMM several times and combine result matrices into >>>>> one bigger >>>>> > > > matrix. >>>>> > > > for example >>>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>> > > > >>>>> > > > Could you please suggest a way of how to do this. >>>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>> > > > >>>>> > > > Thanks >>>>> > > > >>>>> > > > Cong Li >>>>> > > > >>>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>>> wrote: >>>>> > > > >>>>> > > > > Cong Li writes: >>>>> > > > > >>>>> > > > > > Hello, >>>>> > > > > > >>>>> > > > > > I am a PhD student using PETsc for my research. >>>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>>> matrix-matrix >>>>> > > > > > multiplication) by using PETSc. >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>> > > > > >>>>> > > >>>>> > >>>>> > >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 5 10:29:22 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 09:29:22 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <55C1A7C3.7030209@gmail.com> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> Message-ID: <87lhdpzrcd.fsf@jedbrown.org> Rongliang Chen writes: > Hi Jed, > > Thanks for your reply. > > I checked the netcdf and hdf5's config.log and could not find any > possible solutions. Can you help me check these two files again? The two > files are attached. Thanks. It looks to me like libhdf5.a needs to be linked with -ldl, which partly defeats the intent of static linking. PETSc folks, do we blame this on HDF5 with --disable-shared not being a truly static build? Should we pass LDLIBS=-ldl so that NetCDF can link? This likely all works if you use shared libraries. (I can't believe this is still a debate in 2015.) 
configure:16585: mpicc -o conftest -g3 -O0 -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__open': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: undefined reference to `dlopen' /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: undefined reference to `dlerror' /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: undefined reference to `dlsym' /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__search_table': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: undefined reference to `dlsym' /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__close': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: undefined reference to `dlclose' collect2: error: ld returned 1 exit status -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Wed Aug 5 11:35:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 10:35:05 -0600 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: <87d1z1zoau.fsf@jedbrown.org> Matthew Knepley writes: > Yes, you are passing a C++ function userMult, so the compiler sticks "this" > in as the first argument. We do not > recommend this kind of wrapping. I.e., either make it a stand-alone function or make the class function static. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 11:50:59 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 01:50:59 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: <0F56C97B-8D7E-470C-A82E-3EBE05824F83@gmail.com> Hong, A is a sparse matrix. In the first loop, I don't use Km(stepIdx) here. However, I want to let Km(stepIdx) matrix be associated with the some of columns of K here. In the second loop, I want to update Km(stepIdx) by using A*Km(stepIdx-1) so that the corresponding columns of K can be updated simultaneously . If I use MAT_INITIAL_MATRIX, I guess I have to copy the values of updated Km(stepIdx) back to the corresponding columns of K after SPMM call. But this copy phrase costs bandwidth, I think. Do you have any idea by which I can do SPMM as well as remove the copy phrase. Thanks Cong Li iPhone???? 2015/08/06 0:28?Hong ??????: > Cong, > For the first loop: > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > ... 
> end do > > Do you use Km(stepIdx) here? > If not, replace MatCreateDense() with > MatMatMult(A,Km(stepIdx-1),.MAT_INITIAL_MATRIX,..). > Is matrix A dense or sparse? > > Hong > > >> On Wed, Aug 5, 2015 at 9:43 AM, Cong Li wrote: >> Hong, >> >> Thanks for your answer. >> However, in my problem, I have a pre-allocated matrix K, and its columns are associated with Km(1), .. Km(step_k) respectively. What I want to do is to update Km(2) by using the result of A*Km(1), and then to update Km(3) by using the product of A and updated Km(2) and so on. >> >> So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even when it is the first time I call >> MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)', >> >> Km(stepIdx) have actually already been allocated (in K). >> >> Do you think it is possible that I can do this, and could you please suggest some possible ways. >> >> Thanks >> >> Cong Li >> >>> On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: >>> Cong: >>> You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. >>> The correct process is >>> >>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) >>> call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) >>> i.e., C has data structure of A*Km(stepIdx-1) and is created in the first call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed values, but not the structures. >>> >>> In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do >>> 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' >>> directly. >>> >>> Hong >>> >>>> On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: >>>> Hi >>>> >>>> I tried the method you suggested. However, I got the error message. >>>> My code and message are below. >>>> >>>> K is the big matrix containing column matrices. 
>>>> >>>> code: >>>> >>>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> >>>> call MatGetLocalSize(R,local_RRow,local_RCol) >>>> >>>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>> >>>> localRsize = local_RRow * local_RCol >>>> do genIdx= 1, localRsize >>>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> end do >>>> >>>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> end do >>>> >>>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> end do >>>> >>>> >>>> And I got the error message as below: >>>> >>>> >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [0]PETSC ERROR: to get more information on the crash. >>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> ---------------------------------------------------- >>>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 >>>> [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file >>>> -------------------------------------------------------------------------- >>>> [mpi::mpi-api::mpi-abort] >>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> with errorcode 59. >>>> >>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> You may or may not see output from other processes, depending on >>>> exactly when Open MPI kills them. 
>>>> -------------------------------------------------------------------------- >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] >>>> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>> [p01-024:26516] [(nil)] >>>> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] >>>> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [0]PETSC ERROR: to get more information on the crash. >>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 >>>> [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file >>>> [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>> >>>> However, if I change from >>>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> to >>>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> >>>> everything is fine. >>>> >>>> could you please suggest some way to solve this? >>>> >>>> Thanks >>>> >>>> Cong Li >>>> >>>>> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: >>>>> Thank you very much for your help and suggestions. >>>>> With your help, finally I could continue my project. >>>>> >>>>> Regards >>>>> >>>>> Cong Li >>>>> >>>>> >>>>> >>>>>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>>>>> >>>>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. >>>>>> >>>>>> Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. >>>>>> >>>>>> Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. >>>>>> >>>>>> Barry >>>>>> >>>>>> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >>>>>> > >>>>>> > Thanks very much. This answer is very helpful. >>>>>> > And I have a following question. >>>>>> > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. >>>>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) >>>>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>>> > >>>>>> > Thanks >>>>>> > >>>>>> > Cong Li >>>>>> > >>>>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: >>>>>> > >>>>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: >>>>>> > > >>>>>> > > I am sorry that I should have explained it more clearly. >>>>>> > > Actually I want to compute a recurrence. >>>>>> > > >>>>>> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. >>>>>> > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] >>>>>> > >>>>>> > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create >>>>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. >>>>>> > >>>>>> > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. >>>>>> > >>>>>> > Barry >>>>>> > >>>>>> > >>>>>> > >>>>>> > > >>>>>> > > Is there any way to do this efficiently. >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: >>>>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>>> > > > Thanks for your reply. >>>>>> > > > >>>>>> > > > I have an other question. >>>>>> > > > I want to do SPMM several times and combine result matrices into one bigger >>>>>> > > > matrix. >>>>>> > > > for example >>>>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>>> > > > >>>>>> > > > Could you please suggest a way of how to do this. >>>>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>>> > > > >>>>>> > > > Thanks >>>>>> > > > >>>>>> > > > Cong Li >>>>>> > > > >>>>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >>>>>> > > > >>>>>> > > > > Cong Li writes: >>>>>> > > > > >>>>>> > > > > > Hello, >>>>>> > > > > > >>>>>> > > > > > I am a PhD student using PETsc for my research. >>>>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix >>>>>> > > > > > multiplication) by using PETSc. >>>>>> > > > > >>>>>> > > > > >>>>>> > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>>> > > > > >>>>>> > > >>>>>> > >>>>>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 12:35:52 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 12:35:52 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87lhdpzrcd.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> Message-ID: On Wed, Aug 5, 2015 at 10:29 AM, Jed Brown wrote: > Rongliang Chen writes: > > > Hi Jed, > > > > Thanks for your reply. > > > > I checked the netcdf and hdf5's config.log and could not find any > > possible solutions. Can you help me check these two files again? The two > > files are attached. Thanks. > > It looks to me like libhdf5.a needs to be linked with -ldl, which partly > defeats the intent of static linking. PETSc folks, do we blame this on > HDF5 with --disable-shared not being a truly static build? Should we > Yes, this is an error in the HDF5 buildsystem. > pass LDLIBS=-ldl so that NetCDF can link? > That would work I think, but looks very strange for a static build (as you said). It appears to me that HDF5 is not suitable for a static build, and I would reconsider this strategy. Thanks, Matt > This likely all works if you use shared libraries. (I can't believe > this is still a debate in 2015.) 
> > configure:16585: mpicc -o conftest -g3 -O0 > -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm > -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib > -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran > -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__open': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: > undefined reference to `dlopen' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: > undefined reference to `dlerror' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: > undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__search_table': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: > undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__close': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: > undefined reference to `dlclose' > collect2: error: ld returned 1 exit status -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 13:01:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 13:01:44 -0500 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: References: Message-ID: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> > On Aug 5, 2015, at 4:47 AM, Cong Li wrote: > >> Hi, >> >> I am wondering if it is necessary to call >> MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the option of MAT_DO_NOT_COPY_VALUES. >> For example, if I have an assembled matrix A, and I call MatDuplicate() to create B, which is a duplication of A. >> Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. You should not need to. But note if you use the flag MAT_DO_NOT_COPY_VALUES the new matrix will have zero for all the numerical entries. > >> >> And 2nd question is : >> just after the MatCreateDense() call and before MatAssemblyBegin() and MatAssemblyEnd() calls, can I use MatGetArray() ? Dense matrices are a special case because room is always allocated for all the matrix entries and one can use MatDenseGetArray() to either access or set any local value. So if you are only setting/accessing local values you don't actually need to use MatSetValues() (though you can) you can just access the locations directly after using MatDenseGetArray(). There is no harm in calling the MatAssemblyBegin/End() "extra" times for dense matrices. >> >> The 3rd question is: >> before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? >> Actually I have read the manual, but I still feel confused about the means of INSERT_VALUES and ADD_VALUES. 
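For reference, a minimal sketch (in C; sizes illustrative, error checking omitted) of the dense-matrix shortcut described in the answer above: local entries of a MATDENSE matrix can be written directly through the array returned by MatDenseGetArray(), without going through MatSetValues(), and the assembly calls are harmless:

   Mat         B;
   PetscScalar *b;
   PetscInt    i, j, mloc, N = 8;

   MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 100, N, NULL, &B);
   MatGetLocalSize(B, &mloc, NULL);
   MatDenseGetArray(B, &b);
   for (j = 0; j < N; j++)            /* local block is column-major,            */
     for (i = 0; i < mloc; i++)       /* leading dimension = number of local rows */
       b[i + j*mloc] = 1.0;
   MatDenseRestoreArray(B, &b);
   MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY); /* "extra" assembly calls do no harm */
   MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
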
> There are a couple of reasons that you need to make these MatAssemblyBegin/End calls: > - entries can be set which should be stored on a different process, so these need to be communicated > - for compressed formats like CSR (as used in MATAIJ and others) the entries need to be processed into their compressed form > In general, the entries of the matrix are not stored in their "usable" forms until you make the MatAssembleEnd call. Rather they are kept in some easy-to-insert-into intermediate storage. INSERT_VALUES means that old values that might be in the matrix are overwritten , and ADD_VALUES means that the new entries from intermediate storage are added to the old values. > > >> >> Thanks >> >> Cong Li > From bsmith at mcs.anl.gov Wed Aug 5 13:30:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 13:30:58 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Send the entire code so that we can compile it and run it ourselves to see what is going wrong. Barry > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > Hi > > I tried the method you suggested. However, I got the error message. > My code and message are below. > > K is the big matrix containing column matrices. > > code: > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > do stepIdx= 2, step_k > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > > And I got the error message as below: > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Signal received! 
> [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > ---------------------------------------------------- > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > -------------------------------------------------------------------------- > [mpi::mpi-api::mpi-abort] > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > [p01-024:26516] [(nil)] > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > [ERR.] 
PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > However, if I change from > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > to > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > everything is fine. > > could you please suggest some way to solve this? > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > Thank you very much for your help and suggestions. > With your help, finally I could continue my project. > > Regards > > Cong Li > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > Barry > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > Thanks very much. This answer is very helpful. > > And I have a following question. > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > I am sorry that I should have explained it more clearly. > > > Actually I want to compute a recurrence. > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > Barry > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > multiplication) by using PETSc. 
> > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > From gpau at lbl.gov Wed Aug 5 13:35:46 2015 From: gpau at lbl.gov (George Pau) Date: Wed, 5 Aug 2015 11:35:46 -0700 Subject: [petsc-users] mumps compile error Message-ID: Hi, I am now having issues with mumps. Similar to my configure options in my previous email: --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp e/2.3.1/bin/ftn I am having now having problem with mumps but I couldn't figure out what is wrong. I have this problem on both NERSC/Edison (using Intel compiler) and on Ubuntu (using gcc compiler): mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined MUMPS_INT8 *keep8, ^ mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined MUMPS_INT8 *keep8; ^ mumps_c.c(284): error: identifier "keep8" is undefined MUMPS_INT8 *keep8; The error messages are longer and can be found in the attached log file. However, if I leave out the --prefix option, then everything is fine. MUMPS will configure correctly. It seems like a linking issue. -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 7357916 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Aug 5 14:13:40 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 14:13:40 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> Message-ID: <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> > On Aug 5, 2015, at 12:35 PM, Matthew Knepley wrote: > > On Wed, Aug 5, 2015 at 10:29 AM, Jed Brown wrote: > Rongliang Chen writes: > > > Hi Jed, > > > > Thanks for your reply. > > > > I checked the netcdf and hdf5's config.log and could not find any > > possible solutions. Can you help me check these two files again? The two > > files are attached. Thanks. > > It looks to me like libhdf5.a needs to be linked with -ldl, which partly > defeats the intent of static linking. PETSc folks, do we blame this on > HDF5 with --disable-shared not being a truly static build? Should we > > Yes, this is an error in the HDF5 buildsystem. > > pass LDLIBS=-ldl so that NetCDF can link? > > That would work I think, but looks very strange for a static build (as you said). It appears to me > that HDF5 is not suitable for a static build, and I would reconsider this strategy. Our approach is always to work around bugs and stupidity in other packages design, so if HDF5 needs to link against -ldl (and -lm it looks like) even with static libraries then we just make that a dependency in hdf5.py We sure don't require people to know that they should "pass LDLIBS=-ldl so that NetCDF can link?" Why is my answer not obvious? 
Barry BTW: needsmath should probably eliminated and handled properly where math is just another package that some packages depend on. > > Thanks, > > Matt > > This likely all works if you use shared libraries. (I can't believe > this is still a debate in 2015.) > > configure:16585: mpicc -o conftest -g3 -O0 -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__open': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: undefined reference to `dlopen' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: undefined reference to `dlerror' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__search_table': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__close': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: undefined reference to `dlclose' > collect2: error: ld returned 1 exit status > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Wed Aug 5 14:23:57 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 14:23:57 -0500 Subject: [petsc-users] mumps compile error In-Reply-To: References: Message-ID: George, Try running with a completely empty directory for the --prefix (perhaps it is picking up some incorrect/outdated stuff there). Also send us the configure.log file from running without a prefix (so we can see what the differences are). I ran a --prefix configure build with MUMPS just now and it was fine. Barry > On Aug 5, 2015, at 1:35 PM, George Pau wrote: > > Hi, > > I am now having issues with mumps. Similar to my configure options in my previous email: > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > e/2.3.1/bin/ftn > > I am having now having problem with mumps but I couldn't figure out what is wrong. I have this problem on both NERSC/Edison (using Intel compiler) and on Ubuntu (using gcc compiler): > > mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined > MUMPS_INT8 *keep8, > ^ > > mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined > MUMPS_INT8 *keep8; > ^ > > mumps_c.c(284): error: identifier "keep8" is undefined > MUMPS_INT8 *keep8; > > The error messages are longer and can be found in the attached log file. > > However, if I leave out the --prefix option, then everything is fine. MUMPS will configure correctly. It seems like a linking issue. 
> > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ > From jed at jedbrown.org Wed Aug 5 14:26:43 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 13:26:43 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> Message-ID: <87pp31y1sc.fsf@jedbrown.org> Barry Smith writes: > Our approach is always to work around bugs and stupidity in other packages design, Do we report it to them as a bug? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Aug 5 15:11:42 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 15:11:42 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87pp31y1sc.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> Message-ID: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: > > Barry Smith writes: >> Our approach is always to work around bugs and stupidity in other packages design, > > Do we report it to them as a bug? When there is a place to report them then we should and sometimes do. Barry From jychang48 at gmail.com Wed Aug 5 15:43:26 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 5 Aug 2015 15:43:26 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: Hi everyone, Not sure how related this may be, but I am also having trouble installing petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; this may take several minutes". I grew impatient after 30 minutes so I had to kill it. Attached is the configure.log. Can y'all figure out what's going on here? Thanks, Justin On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: > > > > Barry Smith writes: > >> Our approach is always to work around bugs and stupidity in other > packages design, > > > > Do we report it to them as a bug? > > When there is a place to report them then we should and sometimes do. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2401571 bytes Desc: not available URL: From gpau at lbl.gov Wed Aug 5 15:44:37 2015 From: gpau at lbl.gov (George Pau) Date: Wed, 5 Aug 2015 13:44:37 -0700 Subject: [petsc-users] mumps compile error In-Reply-To: References: Message-ID: Hi Barry, Thanks. That is indeed the reason. Once I deleted the old directory, everything is configured correctly. 
Thanks, George On Wed, Aug 5, 2015 at 12:23 PM, Barry Smith wrote: > > George, > > Try running with a completely empty directory for the --prefix (perhaps > it is picking up some incorrect/outdated stuff there). > > Also send us the configure.log file from running without a prefix (so > we can see what the differences are). > > I ran a --prefix configure build with MUMPS just now and it was fine. > > Barry > > > On Aug 5, 2015, at 1:35 PM, George Pau wrote: > > > > Hi, > > > > I am now having issues with mumps. Similar to my configure options in > my previous email: > > > > --with-debugging=1 --with-shared-libraries=0 > --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps > --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf > --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC > --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > I am having now having problem with mumps but I couldn't figure out what > is wrong. I have this problem on both NERSC/Edison (using Intel compiler) > and on Ubuntu (using gcc compiler): > > > > mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined > > MUMPS_INT8 *keep8, > > ^ > > > > mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined > > MUMPS_INT8 *keep8; > > ^ > > > > mumps_c.c(284): error: identifier "keep8" is undefined > > MUMPS_INT8 *keep8; > > > > The error messages are longer and can be found in the attached log file. > > > > However, if I leave out the --prefix option, then everything is fine. > MUMPS will configure correctly. It seems like a linking issue. > > > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 15:49:35 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 15:49:35 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: > Hi everyone, > > Not sure how related this may be, but I am also having trouble installing > petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; > this may take several minutes". I grew impatient after 30 minutes so I had > to kill it. > > Attached is the configure.log. Can y'all figure out what's going on here? > Is this being built on a system with nonlocal disk? This can make builds take forever. The timeout on this operation is 100 minutes. Thanks, Matt > Thanks, > Justin > > On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: > >> >> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> > >> > Barry Smith writes: >> >> Our approach is always to work around bugs and stupidity in other >> packages design, >> > >> > Do we report it to them as a bug? 
>> >> When there is a place to report them then we should and sometimes do. >> >> Barry >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 5 15:49:52 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 14:49:52 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: <87egjhxxxr.fsf@jedbrown.org> Barry Smith writes: >> On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> >> Barry Smith writes: >>> Our approach is always to work around bugs and stupidity in other packages design, >> >> Do we report it to them as a bug? > > When there is a place to report them then we should and sometimes do. Nominally, help at hdfgroup.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jychang48 at gmail.com Wed Aug 5 15:51:25 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 5 Aug 2015 15:51:25 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: no this is being built on my macbook laptop. The difference between yesterday and today is that I updated my gcc compiler (downloaded via brew) to gcc-5.0 and reinstalled openmpi accordingly On Wed, Aug 5, 2015 at 3:49 PM, Matthew Knepley wrote: > On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: > >> Hi everyone, >> >> Not sure how related this may be, but I am also having trouble installing >> petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; >> this may take several minutes". I grew impatient after 30 minutes so I had >> to kill it. >> >> Attached is the configure.log. Can y'all figure out what's going on here? >> > > Is this being built on a system with nonlocal disk? This can make builds > take forever. The timeout on this operation is > 100 minutes. > > Thanks, > > Matt > > >> Thanks, >> Justin >> >> On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: >> >>> >>> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >>> > >>> > Barry Smith writes: >>> >> Our approach is always to work around bugs and stupidity in other >>> packages design, >>> > >>> > Do we report it to them as a bug? >>> >>> When there is a place to report them then we should and sometimes do. >>> >>> Barry >>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Aug 5 15:56:19 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 15:56:19 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: On Wed, Aug 5, 2015 at 3:51 PM, Justin Chang wrote: > no this is being built on my macbook laptop. The difference between > yesterday and today is that I updated my gcc compiler (downloaded via brew) > to gcc-5.0 and reinstalled openmpi accordingly > Can you go to the directory and execute it manually? cd /Users/justin/Software/petsc/arch-darwin-c-opt-firedrake/externalpackages/hdf5-1.8.12 && /usr/bin/make -j 4 Maybe there is a problem using multiple make threads on this machine... Thanks, Matt > On Wed, Aug 5, 2015 at 3:49 PM, Matthew Knepley wrote: > >> On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: >> >>> Hi everyone, >>> >>> Not sure how related this may be, but I am also having trouble >>> installing petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running >>> make on HDF5; this may take several minutes". I grew impatient after 30 >>> minutes so I had to kill it. >>> >>> Attached is the configure.log. Can y'all figure out what's going on here? >>> >> >> Is this being built on a system with nonlocal disk? This can make builds >> take forever. The timeout on this operation is >> 100 minutes. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Justin >>> >>> On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: >>> >>>> >>>> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >>>> > >>>> > Barry Smith writes: >>>> >> Our approach is always to work around bugs and stupidity in other >>>> packages design, >>>> > >>>> > Do we report it to them as a bug? >>>> >>>> When there is a place to report them then we should and sometimes do. >>>> >>>> Barry >>>> >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 20:56:24 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 10:56:24 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: The entire source code files are attached. Also I copy and paste the here in this email thanks program test implicit none #include #include #include #include PetscViewer :: view ! sparse matrix Mat :: A ! distributed dense matrix of size n x m Mat :: B, X, R, QDlt, AQDlt ! distributed dense matrix of size n x (m x k) Mat :: Q, K, AQ_p, AQ ! 
local dense matrix (every process keep the identical copies), (m x k) x (m x k) Mat :: AConjPara, QtAQ, QtAQ_p, Dlt PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize PetscInt :: ownRowS,ownRowE PetscScalar, allocatable :: XInit(:,:) PetscInt :: XInitI, XInitJ PetscScalar :: v=1.0 PetscBool :: flg PetscMPIInt :: size, rank character(128) :: fin, rhsfin call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) ! read binary matrix file call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) call MatCreate(PETSC_COMM_WORLD,A,ierr) call MatSetType(A,MATAIJ,ierr) call MatLoad(A,view,ierr) call PetscViewerDestroy(view,ierr) ! for the time being, assume mDim == nDim is true call MatGetSize(A, nDim, mDim, ierr) if (rank == 0) then print*,'Mat Size = ', nDim, mDim end if call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) ! create right-and-side matrix ! for the time being, choose row-wise decomposition ! for the time being, assume nDim%size = 0 call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) call MatLoad(B,view,ierr) call PetscViewerDestroy(view,ierr) call MatGetSize(B, rhsMDim, rhsNDim, ierr) if (rank == 0) then print*,'MRHS Size actually are:', rhsMDim, rhsNDim print*,'MRHS Size should be:', nDim, bsize end if call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) ! inintial value guses X allocate(XInit(nDim,bsize)) do XInitI=1, nDim do XInitJ=1, bsize XInit(XInitI,XInitJ) = 1.0 end do end do call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & bsize, nDim, bsize,XInit, X, ierr) call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) ! B, X, R, QDlt, AQDlt call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) ! 
Q, K, AQ_p, AQ of size n x (m x k) call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& PETSC_NULL_SCALAR, QtAQ, ierr) call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) ! calculation for R ! call matrix powers kernel call mpk_monomial (K, A, R, step_k, rank,size) ! destory matrices deallocate(XInit) call MatDestroy(B, ierr) call MatDestroy(X, ierr) call MatDestroy(R, ierr) call MatDestroy(QDlt, ierr) call MatDestroy(AQDlt, ierr) call MatDestroy(Q, ierr) call MatDestroy(K, ierr) call MatDestroy(AQ_p, ierr) call MatDestroy(AQ, ierr) call MatDestroy(QtAQ, ierr) call MatDestroy(QtAQ_p, ierr) call MatDestroy(Dlt, ierr) call PetscFinalize(ierr) stop end program test subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) implicit none #include #include #include #include Mat :: K, Km(step_k) Mat :: A, R PetscMPIInt :: sizeMPI, rank PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx PetscInt :: ierr PetscInt :: stepIdx, blockShift, localRsize PetscScalar :: KArray(1), RArray(1), PetscScalarSize PetscOffset :: KArrayOffset, RArrayOffset call MatGetSize(R, nDim, bsize, ierr) if (rank == 0) then print*,'Mat Size = ', nDim, bsize end if call MatGetArray(K,KArray,KArrayOffset,ierr) call MatGetLocalSize(R,local_RRow,local_RCol) ! print *, "local_RRow,local_RCol", local_RRow,local_RCol ! get arry from R to add values to K(1) call MatGetArray(R,RArray,RArrayOffset,ierr) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & ! 
,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) localRsize = local_RRow * local_RCol do genIdx= 1, localRsize KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) end do call MatRestoreArray(R,RArray,RArrayOffset,ierr) call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) end do call MatRestoreArray(K,KArray,KArrayOffset,ierr) ! do stepIdx= 2, step_k do stepIdx= 2,2 call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) end do ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) end subroutine mpk_monomial Cong Li On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to > see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > [0]PETSC ERROR: Signal received! 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity > pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mainprogram.f90 Type: application/octet-stream Size: 5681 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mpk_monomial.f90 Type: application/octet-stream Size: 2294 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 21:00:37 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 11:00:37 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> References: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> Message-ID: Barry, Thanks. I think I understood. Cong Li On Thu, Aug 6, 2015 at 3:01 AM, Barry Smith wrote: > > > On Aug 5, 2015, at 4:47 AM, Cong Li wrote: > > > >> Hi, > >> > >> I am wondering if it is necessary to call > >> MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the > option of MAT_DO_NOT_COPY_VALUES. > >> For example, if I have an assembled matrix A, and I call MatDuplicate() > to create B, which is a duplication of A. > >> Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. > > You should not need to. But note if you use the flag > MAT_DO_NOT_COPY_VALUES the new matrix will have zero for all the numerical > entries. > > > >> > >> And 2nd question is : > >> just after the MatCreateDense() call and before MatAssemblyBegin() and > MatAssemblyEnd() calls, can I use MatGetArray() ? 
> > Dense matrices are a special case because room is always allocated for > all the matrix entries and one can use MatDenseGetArray() to either access > or set any local value. So if you are only setting/accessing local values > you don't actually need to use MatSetValues() (though you can) you can just > access the locations directly after using MatDenseGetArray(). There is no > harm in calling the MatAssemblyBegin/End() "extra" times for dense matrices. > > >> > >> The 3rd question is: > >> before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use > INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? > >> Actually I have read the manual, but I still feel confused about the > means of INSERT_VALUES and ADD_VALUES. > > There are a couple of reasons that you need to make these > MatAssemblyBegin/End calls: > > - entries can be set which should be stored on a different process, so > these need to be communicated > > - for compressed formats like CSR (as used in MATAIJ and others) the > entries need to be processed into their compressed form > > In general, the entries of the matrix are not stored in their "usable" > forms until you make the MatAssembleEnd call. Rather they are kept in some > easy-to-insert-into intermediate storage. INSERT_VALUES means that old > values that might be in the matrix are overwritten , and ADD_VALUES means > that the new entries from intermediate storage are added to the old values. > > > > > >> > >> Thanks > >> > >> Cong Li > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 21:43:35 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 21:43:35 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Send the input files so I can actually run the thing (and make sure they are not very big, debugging with large data sets is silly and unproductive). Thanks Barry > On Aug 5, 2015, at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! 
read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > ! 
call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > -------------------------------------------------------------------------- > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > From gbisht at lbl.gov Wed Aug 5 22:21:18 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Wed, 5 Aug 2015 20:21:18 -0700 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: Hi Matt, Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x 10.10 and the example runs fine. Btw, are there any examples that use DMPlex+DMComposite? Thanks, -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 22:23:58 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 22:23:58 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong, Can you write out math equations for mpk_monomial (), list input and output parameters. Note: 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) Hong On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! 
local dense matrix (every process keep the identical copies), (m x k) > x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, > ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > end do > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > >> >> Send the entire code so that we can compile it and run it ourselves to >> see what is going wrong. >> >> Barry >> >> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >> > >> > Hi >> > >> > I tried the method you suggested. However, I got the error message. >> > My code and message are below. >> > >> > K is the big matrix containing column matrices. >> > >> > code: >> > >> > call MatGetArray(K,KArray,KArrayOffset,ierr) >> > >> > call MatGetLocalSize(R,local_RRow,local_RCol) >> > >> > call MatGetArray(R,RArray,RArrayOffset,ierr) >> > >> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >> + 1), Km(1), ierr) >> > >> > localRsize = local_RRow * local_RCol >> > do genIdx= 1, localRsize >> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> > end do >> > >> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> > >> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> > >> > do stepIdx= 2, step_k >> > >> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> > >> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> > PETSC_DECIDE , nDim, >> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> > end do >> > >> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> > >> > do stepIdx= 2, step_k >> > >> > call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> > end do >> > >> > >> > And I got the error message as below: >> > >> > >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> > [0]PETSC ERROR: to get more information on the crash. 
>> > [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> > [0]PETSC ERROR: Signal received! >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >> 22:15:24 CDT 2013 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. >> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> > ---------------------------------------------------- >> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >> Wed Aug 5 18:24:40 2015 >> > [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> > >> -------------------------------------------------------------------------- >> > [mpi::mpi-api::mpi-abort] >> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> > with errorcode 59. >> > >> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> > You may or may not see output from other processes, depending on >> > exactly when Open MPI kills them. 
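(For reference on the MatReuse question running through this thread: MAT_INITIAL_MATRIX asks PETSc to allocate the product matrix C itself, while MAT_REUSE_MATRIX is meant for later calls that refill a C obtained from an earlier MAT_INITIAL_MATRIX call, which is Hong's note in this thread; Barry's point is that, because the result is dense and needs no symbolic stage, reusing a user-created C should in principle also work, even though it crashes here. A minimal Fortran sketch of the documented pattern, in the same style as the code posted in this thread, is below. The subroutine name, nIter, and the include paths are placeholders rather than part of the posted code, and the spelling of the default-fill constant varies between PETSc releases.)

subroutine matmatmult_reuse_sketch(A, B, nIter)
  implicit none
#include <finclude/petscsys.h>
#include <finclude/petscmat.h>
  Mat            :: A, B, C
  PetscInt       :: it, nIter
  PetscErrorCode :: ierr

  ! First product: MAT_INITIAL_MATRIX lets PETSc create and size C.
  ! The fill argument is a PetscReal; on the 3.3 series used in this
  ! thread the default constant is PETSC_DEFAULT_DOUBLE_PRECISION.
  call MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT_REAL, C, ierr)
  do it = 2, nIter
    ! ... refill B with new data here ...
    ! Later products: MAT_REUSE_MATRIX refills the C created above.
    call MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT_REAL, C, ierr)
  end do
  call MatDestroy(C, ierr)
end subroutine matmatmult_reuse_sketch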
>> > >> -------------------------------------------------------------------------- >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >> [0xffffffff0091f684] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >> [0xffffffff006c389c] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >> [0xffffffff006db3ac] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >> [0xffffffff00281bf0] >> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >> > [p01-024:26516] [(nil)] >> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >> [0xffffffff02d3b81c] >> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >> the batch system) has told this process to end >> > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> > [0]PETSC ERROR: to get more information on the crash. >> > [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> > [0]PETSC ERROR: Signal received! >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >> 22:15:24 CDT 2013 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >> Wed Aug 5 18:24:40 2015 >> > [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> > [ERR.] PLE 0019 plexec One of MPI processes was >> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >> > >> > However, if I change from >> > call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> > to >> > call MatMatMult(A,Km(stepIdx-1), >> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> > >> > everything is fine. >> > >> > could you please suggest some way to solve this? >> > >> > Thanks >> > >> > Cong Li >> > >> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >> wrote: >> > Thank you very much for your help and suggestions. >> > With your help, finally I could continue my project. >> > >> > Regards >> > >> > Cong Li >> > >> > >> > >> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >> > >> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >> created. >> > >> > Since you want to use the C that is passed in you should use >> MAT_REUSE_MATRIX. >> > >> > Note that since your B and C matrices are dense the issue of sparsity >> pattern of C is not relevant. >> > >> > Barry >> > >> > > On Aug 4, 2015, at 11:59 AM, Cong Li >> wrote: >> > > >> > > Thanks very much. This answer is very helpful. >> > > And I have a following question. >> > > If I create B1, B2, .. by the way you suggested and then use >> MatMatMult to do SPMM. >> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >> fill,Mat *C) >> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >> > > >> > > Thanks >> > > >> > > Cong Li >> > > >> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >> wrote: >> > > >> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >> wrote: >> > > > >> > > > I am sorry that I should have explained it more clearly. >> > > > Actually I want to compute a recurrence. >> > > > >> > > > Like, I want to firstly compute A*X1=B1, and then calculate >> A*B1=B2, A*B2=B3 and so on. >> > > > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 ...] >> > > >> > > First create C with MatCreateDense(,&C). 
Then call >> MatDenseGetArray(C,&array); then create B1 with >> MatCreateDense(....,array,&B1); then create >> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >> the number of __local__ rows in B1 times the number of columns in B1, then >> create B3 with a larger shift etc. >> > > >> > > Note that you are "sharing" the array space of C with B1, B2, B3, >> ..., each Bi contains its columns of the C matrix. >> > > >> > > Barry >> > > >> > > >> > > >> > > > >> > > > Is there any way to do this efficiently. >> > > > >> > > > >> > > > >> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >> patrick.sanan at gmail.com> wrote: >> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > > Thanks for your reply. >> > > > > >> > > > > I have an other question. >> > > > > I want to do SPMM several times and combine result matrices into >> one bigger >> > > > > matrix. >> > > > > for example >> > > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > > >> > > > > Could you please suggest a way of how to do this. >> > > > This is just linear algebra, nothing to do with PETSc specifically. >> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > > >> > > > > Thanks >> > > > > >> > > > > Cong Li >> > > > > >> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >> wrote: >> > > > > >> > > > > > Cong Li writes: >> > > > > > >> > > > > > > Hello, >> > > > > > > >> > > > > > > I am a PhD student using PETsc for my research. >> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > > > > multiplication) by using PETSc. >> > > > > > >> > > > > > >> > > > > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > > >> > > > >> > > >> > > >> > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 23:29:29 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 23:29:29 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > Cong, > > Can you write out math equations for mpk_monomial (), > list input and output parameters. > > Note: > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. Barry > > Hong > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! 
local dense matrix (every process keep the identical copies), (m x k) x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! 
,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > -------------------------------------------------------------------------- > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Thu Aug 6 00:12:47 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:12:47 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Sure. Attached are input files. mesh1e1.mtx.pbin is the binary petsc file for a 48x48 S.P.D. sparse matrix, which is A in the program. b.m48.n2.dat is the binary petsc file of a 2 column right-hand side matrix, which is B in the program. I use mpiexec -n 2 ./progrma.name -f ~/mesh1e1.mtx.pbin -r ~/b.m48.n2.dat -k 2 -i 2 -w 2 to run the program Thanks Cong Li On Thu, Aug 6, 2015 at 11:43 AM, Barry Smith wrote: > > Send the input files so I can actually run the thing (and make sure they > are not very big, debugging with large data sets is silly and unproductive). > > Thanks > > Barry > > > On Aug 5, 2015, at 8:56 PM, Cong Li wrote: > > > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! 
local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using 
--with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: b.m48.n2.dat Type: application/octet-stream Size: 1360 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mesh1e1.mtx.pbin Type: application/octet-stream Size: 3880 bytes Desc: not available URL: From solvercorleone at gmail.com Thu Aug 6 00:22:20 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:22:20 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, Sure. I want to extend the Krylov subspace by step_k dimensions by using a monomial basis, which can be defined as K = {Km(1), Km(2), ..., Km(step_k)} = {Km(1), AKm(1), AKm(2), ..., AKm(step_k-1)} = {R, AR, A^2R, ..., A^(step_k-1)R}, in one loop. So, my plan now is to first calculate the recurrence, which is P_n(x) = x P_{n-1}(x), and then use the results to update the items in K. Then, in the next loop of the Krylov subspace method, K will be updated again. The input of the mpk_monomial subroutine is a preallocated dense matrix K; A and R are used to update K inside the subroutine.
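For readers following the thread, here is a minimal sketch of this construction in PETSc's C interface (using the newer MatDenseGetArray/MatDenseRestoreArray names that correspond to MatGetArray/MatRestoreArray in the petsc-3.3 Fortran code above; the function and variable names are illustrative only, error checking is omitted, and it assumes R and K use the same row distribution). Note that the local storage of an MPI dense matrix is (local rows) x (global columns), which is the "shift" Barry describes. Whether MAT_REUSE_MATRIX may be used directly on these user-array blocks is exactly what the rest of the thread debates.

#include <petscmat.h>

/* Sketch only: build K = [R, A*R, A^2*R, ..., A^(step_k-1)*R] by wrapping
   column blocks of a preallocated dense K so that they share its storage. */
PetscErrorCode mpk_monomial_sketch(Mat A, Mat R, Mat K, PetscInt step_k)
{
  PetscScalar *karray, *rarray;
  PetscInt    mloc, N, i, j, blocklen;
  Mat         *Km;

  MatGetLocalSize(R, &mloc, NULL);   /* local rows of R (and of each block)  */
  MatGetSize(R, NULL, &N);           /* global columns of R = width of block */
  blocklen = mloc * N;               /* local storage length of one block    */

  PetscMalloc1(step_k, &Km);
  MatDenseGetArray(K, &karray);      /* K's local storage, column-major      */
  for (i = 0; i < step_k; i++) {
    /* Km[i] is a view of columns i*N .. (i+1)*N-1 of K; no data is copied */
    MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, PETSC_DECIDE, N,
                   karray + i*blocklen, &Km[i]);
    MatAssemblyBegin(Km[i], MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(Km[i], MAT_FINAL_ASSEMBLY);
  }

  /* first block: copy the local part of R into K */
  MatDenseGetArray(R, &rarray);
  for (j = 0; j < blocklen; j++) karray[j] = rarray[j];
  MatDenseRestoreArray(R, &rarray);

  /* recurrence Km[i] = A * Km[i-1]; which MatReuse flag is legal here for a
     user-array dense result is the open question in this thread */
  for (i = 1; i < step_k; i++) {
    MatMatMult(A, Km[i-1], MAT_REUSE_MATRIX, PETSC_DEFAULT, &Km[i]);
  }

  for (i = 0; i < step_k; i++) MatDestroy(&Km[i]); /* K's storage is not freed */
  PetscFree(Km);
  MatDenseRestoreArray(K, &karray);
  return 0;
}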
Thanks Cong Li On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: > Cong, > > Can you write out math equations for mpk_monomial (), > list input and output parameters. > > Note: > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > >> The entire source code files are attached. >> >> Also I copy and paste the here in this email >> >> thanks >> >> program test >> >> implicit none >> >> #include >> #include >> #include >> #include >> >> >> PetscViewer :: view >> ! sparse matrix >> Mat :: A >> ! distributed dense matrix of size n x m >> Mat :: B, X, R, QDlt, AQDlt >> ! distributed dense matrix of size n x (m x k) >> Mat :: Q, K, AQ_p, AQ >> ! local dense matrix (every process keep the identical copies), (m x k) >> x (m x k) >> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >> >> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >> step_k,bsize >> PetscInt :: ownRowS,ownRowE >> PetscScalar, allocatable :: XInit(:,:) >> PetscInt :: XInitI, XInitJ >> PetscScalar :: v=1.0 >> PetscBool :: flg >> PetscMPIInt :: size, rank >> >> character(128) :: fin, rhsfin >> >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >> >> ! read binary matrix file >> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >> >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >> >> >> call >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >> call MatCreate(PETSC_COMM_WORLD,A,ierr) >> call MatSetType(A,MATAIJ,ierr) >> call MatLoad(A,view,ierr) >> call PetscViewerDestroy(view,ierr) >> ! for the time being, assume mDim == nDim is true >> call MatGetSize(A, nDim, mDim, ierr) >> >> if (rank == 0) then >> print*,'Mat Size = ', nDim, mDim >> end if >> >> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >> >> ! create right-and-side matrix >> ! for the time being, choose row-wise decomposition >> ! for the time being, assume nDim%size = 0 >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >> call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, >> ierr) >> call MatLoad(B,view,ierr) >> call PetscViewerDestroy(view,ierr) >> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >> if (rank == 0) then >> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >> print*,'MRHS Size should be:', nDim, bsize >> end if >> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >> >> ! inintial value guses X >> allocate(XInit(nDim,bsize)) >> do XInitI=1, nDim >> do XInitJ=1, bsize >> XInit(XInitI,XInitJ) = 1.0 >> end do >> end do >> >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> bsize, nDim, bsize,XInit, X, ierr) >> >> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >> >> >> ! 
B, X, R, QDlt, AQDlt >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >> >> ! Q, K, AQ_p, AQ of size n x (m x k) >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> (bsize*step_k), nDim, >> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >> >> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >> PETSC_NULL_SCALAR, QtAQ, ierr) >> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >> >> ! calculation for R >> >> ! call matrix powers kernel >> call mpk_monomial (K, A, R, step_k, rank,size) >> >> ! destory matrices >> deallocate(XInit) >> >> call MatDestroy(B, ierr) >> call MatDestroy(X, ierr) >> call MatDestroy(R, ierr) >> call MatDestroy(QDlt, ierr) >> call MatDestroy(AQDlt, ierr) >> call MatDestroy(Q, ierr) >> call MatDestroy(K, ierr) >> call MatDestroy(AQ_p, ierr) >> call MatDestroy(AQ, ierr) >> call MatDestroy(QtAQ, ierr) >> call MatDestroy(QtAQ_p, ierr) >> call MatDestroy(Dlt, ierr) >> >> >> call PetscFinalize(ierr) >> >> stop >> >> end program test >> >> >> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >> implicit none >> >> #include >> #include >> #include >> #include >> >> Mat :: K, Km(step_k) >> Mat :: A, R >> PetscMPIInt :: sizeMPI, rank >> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >> PetscInt :: ierr >> PetscInt :: stepIdx, blockShift, localRsize >> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >> PetscOffset :: KArrayOffset, RArrayOffset >> >> call MatGetSize(R, nDim, bsize, ierr) >> if (rank == 0) then >> print*,'Mat Size = ', nDim, bsize >> end if >> >> call MatGetArray(K,KArray,KArrayOffset,ierr) >> >> call MatGetLocalSize(R,local_RRow,local_RCol) >> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >> >> ! 
get arry from R to add values to K(1) >> call MatGetArray(R,RArray,RArrayOffset,ierr) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + >> 1), Km(1), ierr) >> >> >> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & >> ! ,local_RRow * local_RCol * >> STORAGE_SIZE(PetscScalarSize), ierr) >> >> localRsize = local_RRow * local_RCol >> do genIdx= 1, localRsize >> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> end do >> >> >> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> >> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> >> do stepIdx= 2, step_k >> >> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), >> Km(stepIdx), ierr) >> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> >> end do >> >> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> >> ! do stepIdx= 2, step_k >> do stepIdx= 2,2 >> >> call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> ! call >> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> end do >> >> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >> >> end subroutine mpk_monomial >> >> >> >> Cong Li >> >> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >> >>> >>> Send the entire code so that we can compile it and run it ourselves >>> to see what is going wrong. >>> >>> Barry >>> >>> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >>> > >>> > Hi >>> > >>> > I tried the method you suggested. However, I got the error message. >>> > My code and message are below. >>> > >>> > K is the big matrix containing column matrices. 
>>> > >>> > code: >>> > >>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>> > >>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>> > >>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>> > >>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >>> + 1), Km(1), ierr) >>> > >>> > localRsize = local_RRow * local_RCol >>> > do genIdx= 1, localRsize >>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> > end do >>> > >>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> > >>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> > >>> > do stepIdx= 2, step_k >>> > >>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> > >>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> > PETSC_DECIDE , nDim, >>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> > end do >>> > >>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> > >>> > do stepIdx= 2, step_k >>> > >>> > call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> > end do >>> > >>> > >>> > And I got the error message as below: >>> > >>> > >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> > [0]PETSC ERROR: to get more information on the crash. >>> > [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> > [0]PETSC ERROR: Signal received! >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> > ---------------------------------------------------- >>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>> Wed Aug 5 18:24:40 2015 >>> > [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> > >>> -------------------------------------------------------------------------- >>> > [mpi::mpi-api::mpi-abort] >>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>> > with errorcode 59. >>> > >>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> > You may or may not see output from other processes, depending on >>> > exactly when Open MPI kills them. 
>>> > >>> -------------------------------------------------------------------------- >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>> [0xffffffff0091f684] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>> [0xffffffff006c389c] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>> [0xffffffff006db3ac] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>> [0xffffffff00281bf0] >>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>> > [p01-024:26516] [(nil)] >>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>> [0xffffffff02d3b81c] >>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>> the batch system) has told this process to end >>> > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> > [0]PETSC ERROR: to get more information on the crash. >>> > [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> > [0]PETSC ERROR: Signal received! >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>> Wed Aug 5 18:24:40 2015 >>> > [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> > [ERR.] PLE 0019 plexec One of MPI processes was >>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>> > >>> > However, if I change from >>> > call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> > to >>> > call MatMatMult(A,Km(stepIdx-1), >>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> > >>> > everything is fine. >>> > >>> > could you please suggest some way to solve this? >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>> wrote: >>> > Thank you very much for your help and suggestions. >>> > With your help, finally I could continue my project. >>> > >>> > Regards >>> > >>> > Cong Li >>> > >>> > >>> > >>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>> wrote: >>> > >>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>> created. >>> > >>> > Since you want to use the C that is passed in you should use >>> MAT_REUSE_MATRIX. >>> > >>> > Note that since your B and C matrices are dense the issue of >>> sparsity pattern of C is not relevant. >>> > >>> > Barry >>> > >>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>> wrote: >>> > > >>> > > Thanks very much. This answer is very helpful. >>> > > And I have a following question. >>> > > If I create B1, B2, .. by the way you suggested and then use >>> MatMatMult to do SPMM. >>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>> fill,Mat *C) >>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>> > > >>> > > Thanks >>> > > >>> > > Cong Li >>> > > >>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>> wrote: >>> > > >>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>> wrote: >>> > > > >>> > > > I am sorry that I should have explained it more clearly. >>> > > > Actually I want to compute a recurrence. >>> > > > >>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>> A*B1=B2, A*B2=B3 and so on. 
>>> > > > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 ...] >>> > > >>> > > First create C with MatCreateDense(,&C). Then call >>> MatDenseGetArray(C,&array); then create B1 with >>> MatCreateDense(....,array,&B1); then create >>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>> the number of __local__ rows in B1 times the number of columns in B1, then >>> create B3 with a larger shift etc. >>> > > >>> > > Note that you are "sharing" the array space of C with B1, B2, B3, >>> ..., each Bi contains its columns of the C matrix. >>> > > >>> > > Barry >>> > > >>> > > >>> > > >>> > > > >>> > > > Is there any way to do this efficiently. >>> > > > >>> > > > >>> > > > >>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>> patrick.sanan at gmail.com> wrote: >>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > > Thanks for your reply. >>> > > > > >>> > > > > I have an other question. >>> > > > > I want to do SPMM several times and combine result matrices into >>> one bigger >>> > > > > matrix. >>> > > > > for example >>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > > >>> > > > > Could you please suggest a way of how to do this. >>> > > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > > >>> > > > > Thanks >>> > > > > >>> > > > > Cong Li >>> > > > > >>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > > >>> > > > > > Cong Li writes: >>> > > > > > >>> > > > > > > Hello, >>> > > > > > > >>> > > > > > > I am a PhD student using PETsc for my research. >>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > > > > multiplication) by using PETSc. >>> > > > > > >>> > > > > > >>> > > > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > > >>> > > > >>> > > >>> > > >>> > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Thu Aug 6 00:27:52 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:27:52 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Barry, Exactly. And thanks for the explaination. Cong Li On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > Cong, > > > > Can you write out math equations for mpk_monomial (), > > list input and output parameters. > > > > Note: > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was > created which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense it is not the difficult case when a symbolic > computation needs to be done initially so, at least in theory, he should > not have to use MAT_INITIAL_MATRIX the first time through. 
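To make the two call sequences being compared concrete, here is a small sketch in PETSc's C interface (names are placeholders, error checking is omitted; which pattern is actually required when the result is a user-array dense matrix is precisely what is being discussed here):

#include <petscmat.h>

/* Sketch: the two MatMatMult usage patterns discussed in this thread. */
PetscErrorCode product_patterns(Mat A, Mat B, Mat Kprev, Mat Knext)
{
  Mat C;

  /* Hong's ordering: the first product uses MAT_INITIAL_MATRIX and creates C;
     later products of the same sizes reuse it. */
  MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);
  MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);
  MatDestroy(&C);

  /* The case in this thread: Knext already exists as a dense matrix wrapping
     user storage, so the product is asked to reuse it directly. Barry's point
     is that, the result being dense, no symbolic stage should be needed; the
     segfault reported above suggests the petsc-3.3 code path still expects a
     preceding MAT_INITIAL_MATRIX call. */
  MatMatMult(A, Kprev, MAT_REUSE_MATRIX, PETSC_DEFAULT, &Knext);
  return 0;
}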
> > Barry > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li > wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! 
B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! 
print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. 
> > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Aug 6 01:36:56 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 06 Aug 2015 14:36:56 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: <55C30088.3000503@gmail.com> Thanks for all your helps! The problem has been solved by using shared libraries. Best, Rongliang On 08/06/2015 04:11 AM, Barry Smith wrote: >> On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> >> Barry Smith writes: >>> Our approach is always to work around bugs and stupidity in other packages design, >> Do we report it to them as a bug? > When there is a place to report them then we should and sometimes do. 
> > Barry > > From dave.mayhem23 at gmail.com Thu Aug 6 02:54:21 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 09:54:21 +0200 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: On 5 August 2015 at 11:15, Nicolas Pozin wrote: > Hello, > > I'm trying to solve a system with a matrix free operator and through > conjugate gradient method. > To make ideas clear, I set up the following simple example (I am using > petsc-3.6) and I get this error message : > " > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: *Petsc Release Version 3.4.3*, Oct, 15, 2013 > Also it appears that you are linking against petsc 3.4, not petsc 3.6. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 6 04:17:24 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 11:17:24 +0200 Subject: [petsc-users] KSP changes for successive solver In-Reply-To: References: <1437083588.21829.18.camel@kolmog5> <94823D83-AABD-4C22-8BF3-EBB0F1B1F7AA@mcs.anl.gov> <1437086528.21829.27.camel@kolmog5> <1437092337.21829.42.camel@kolmog5> <1437762913.17123.11.camel@kolmog5> <1437767070.17123.17.camel@kolmog5> <04691CE0-B35E-4F46-ABCA-6B05EA033F19@mcs.anl.gov> Message-ID: > I agree with you more than the "consensus". I think the consensus does > it just because it is perceived as too difficult or we don't have the right > infrastructure to do it "correctly" > > > > In the end that is what I want to do. :D > > > > I would be happy to contribute a similar repartitioning preconditioner > to petsc. > > We'd love to have this reduced processor repartitioning for both > DMDA/PCMG and for PCGAMG in PETSc. > > Hi Barry, I've created a pull-request which defines such a preconditiner. I've tentatively called it SemiRedundant - but I don't think it is a great name in the sense it doesn't really describe what the preconditioner actually can do. I hate naming things. Possibly "Repart" or "Repartition" would be better names. Given the existence of "Redistribute", "Redundant", it is likely that it will be hard for a new user to know what the actual difference is between all these preconditioners.... Cheers, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Thu Aug 6 06:34:45 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Thu, 6 Aug 2015 11:34:45 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Hong, I have been using PETSC_COMM_WORLD. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 5 augusti 2015 17:11 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. 
Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: As you noticed, you ran the code in serial mode, not parallel. Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
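The "1 MPI processes" -ksp_view output above is exactly what one sees when the solver objects live on a communicator that spans only one rank. A minimal communicator sanity check (a sketch, not taken from the actual code; error checking omitted) would be:

  #include <petscksp.h>

  int main(int argc, char **argv)
  {
    KSP         ksp;
    PetscMPIInt size;

    PetscInitialize(&argc, &argv, NULL, NULL);
    MPI_Comm_size(PETSC_COMM_WORLD, &size);
    PetscPrintf(PETSC_COMM_WORLD, "running on %d MPI processes\n", size);

    /* the Mat, the Vecs and the KSP must all be created on PETSC_COMM_WORLD,
       otherwise "mpiexec -n 2" just runs two independent serial solves       */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* ... KSPSetOperators(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); ... */
    KSPDestroy(&ksp);
    PetscFinalize();
    return 0;
  }

If this prints 1 under "mpiexec -n 2" even though everything is on PETSC_COMM_WORLD, a common cause is launching with an mpiexec that does not match the MPI library PETSc was built against (e.g. a system mpiexec instead of the one from --download-mpich), so each process believes it is rank 0 of a size-1 world.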
If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. 
remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. 
Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? 
(the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== 
==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx 
(pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 
0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== 
by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: 
MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: 
PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 6 06:44:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Aug 2015 06:44:02 -0500 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: On Wed, Aug 5, 2015 at 10:21 PM, Gautam Bisht wrote: > Hi Matt, > > Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x 10.10 > and the example runs fine. > > Btw, are there any examples that use DMPlex+DMComposite? > I don't think so. What would you anticipate using it for? Thanks, Matt > Thanks, > -Gautam. 
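Hong's point 2 in the SuperLU thread above (the matrix structure does not change with omega, so the symbolic factorization only needs to be done once and reused) could be sketched roughly as below. The function name is hypothetical; A, M, K, F and u are assumed to be created and assembled elsewhere with identical nonzero patterns for A, M and K; error checking is omitted:

  #include <petscksp.h>

  PetscErrorCode SolveFrequencySweep(Mat A, Mat M, Mat K, Vec F, Vec u,
                                     const PetscReal *omega, PetscInt nomega)
  {
    KSP      ksp;
    PC       pc;
    PetscInt i;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetType(ksp, KSPPREONLY);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCLU);
    PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);
    KSPSetFromOptions(ksp);

    for (i = 0; i < nomega; i++) {
      MatCopy(K, A, SAME_NONZERO_PATTERN);                       /* A  = K           */
      MatAXPY(A, -omega[i]*omega[i], M, SAME_NONZERO_PATTERN);   /* A -= omega^2 * M */
      KSPSetOperators(ksp, A, A);  /* same nonzero pattern: only the numeric factorization is redone */
      KSPSolve(ksp, F, u);
      /* ... store or post-process u for this frequency ... */
    }
    KSPDestroy(&ksp);
    return 0;
  }

The -ksp_view output shown earlier in the thread ("Repeated factorization SamePattern_SameRowPerm") is what makes the repeated factorizations cheaper than starting from scratch at every frequency.

On the 20 GB memory estimate: taking roughly 40 nonzeros per row (only illustrative; the thread says "in the tens") for just under 10^6 complex unknowns gives about 4e7 nonzeros in A. If the fill ratio is anywhere near the 3D range Sherry quotes (50-100x), the L and U factors hold 2-4e9 entries, i.e. roughly 32-64 GB of complex values at 16 bytes each before counting the integer index arrays. That is well above 20 GB and, at the high end, uncomfortably close to the 128 GB machine, which would be consistent with the Calloc/Malloc failures seen in zdistribute.c.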
> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Aug 6 09:36:24 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 09:36:24 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Message-ID: Mahir: > > > > I have been using PETSC_COMM_WORLD. > What do you get by running a petsc example, e.g., petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 2 MPI processes type: gmres ... Hong > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 5 augusti 2015 17:11 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > As you noticed, you ran the code in serial mode, not parallel. > > Check your code on input communicator, e.g., what input communicator do > you use in > > KSPCreate(comm,&ksp)? > > > > I have added error flag to superlu_dist interface (released version). When > user uses '-mat_superlu_dist_parsymbfact' > > in serial mode, this option is ignored with a warning. > > > > Hong > > > > Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? 
> > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 19:06 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... > > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. 
I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. 
Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. 
> > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
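For reference, a typical way to run a PETSc example under valgrind with MPI looks like the command below (standard valgrind options along the lines of the FAQ page linked above; the example name and solver options are placeholders matching this thread, not a prescribed command):

   mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 --track-origins=yes --log-file=valgrind.log.%p ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact

Each rank then writes its report to its own valgrind.log.<pid> file, which makes traces like the ones below easier to attribute to a particular process.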
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
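As an aside, SuperLU_DIST can report how much memory the factorization actually needed. PETSc's SuperLU_DIST interface exposes this through the runtime option -mat_superlu_dist_statprint, and the column ordering through -mat_superlu_dist_colperm (for example PARMETIS); these are standard options of the interface rather than commands quoted in this thread, so check the -help output of your executable to confirm the names. A sketch of such a run, with the process count and executable name as placeholders:

   mpiexec -n 16 ./solve -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_statprint -mat_superlu_dist_colperm PARMETIS

The printed statistics include the memory used by the factorization, which helps answer the per-MPI-task memory question above.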
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gbisht at lbl.gov Thu Aug 6 10:08:58 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Thu, 6 Aug 2015 08:08:58 -0700 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: I'm going to move this thread over to dev mailing list. -Gautam. On Thu, Aug 6, 2015 at 4:44 AM, Matthew Knepley wrote: > On Wed, Aug 5, 2015 at 10:21 PM, Gautam Bisht wrote: > >> Hi Matt, >> >> Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x >> 10.10 and the example runs fine. >> >> Btw, are there any examples that use DMPlex+DMComposite? >> > > I don't think so. What would you anticipate using it for? > > Thanks, > > Matt > > >> Thanks, >> -Gautam. 
>> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > From hzhang at mcs.anl.gov Thu Aug 6 10:09:02 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 10:09:02 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Barry: > > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was > created, which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense, it is not the difficult case where a symbolic computation needs to be done initially, so, at least in theory, > he should not have to use MAT_INITIAL_MATRIX the first time through. 
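For context, the calling pattern MatMatMult() normally expects when the same product matrix is reused is sketched below. This is a minimal Fortran sketch in the spirit of the code quoted later in this thread; P, iter and maxIter are placeholder names, PETSC_DEFAULT_DOUBLE_PRECISION stands in for the fill estimate, and it is not a fix for the shared-storage scheme being discussed here.

      ! First call: PETSc creates the product matrix P and attaches the
      ! internal data that later numeric products will reuse.
      call MatMatMult(A, Km(1), MAT_INITIAL_MATRIX, PETSC_DEFAULT_DOUBLE_PRECISION, P, ierr)
      ! Subsequent calls with the same P reuse that data.
      do iter = 2, maxIter
         call MatMatMult(A, Km(1), MAT_REUSE_MATRIX, PETSC_DEFAULT_DOUBLE_PRECISION, P, ierr)
      end do

The conflict in this thread is that each Km(stepIdx) is created by the user around a slice of K's array rather than by an initial MatMatMult() call, so there is no such internal data for MAT_REUSE_MATRIX to fall back on.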
for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! 
call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
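Below is a condensed Fortran sketch of the column-sharing layout described in the reply that follows; names and sizes are placeholders, and the calls mirror the MatGetArray/MatCreateDense usage appearing elsewhere in this thread. C owns the storage, and each B(i) wraps one bsize-wide block of columns of C's local array.

      Mat         :: C, B(step_k)
      PetscScalar :: CArray(1)
      PetscOffset :: COffset
      PetscInt    :: localRows, localCols, shift, i, ierr

      call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, &
                          nDim, bsize*step_k, PETSC_NULL_SCALAR, C, ierr)
      call MatGetArray(C, CArray, COffset, ierr)
      call MatGetLocalSize(C, localRows, localCols, ierr)
      do i = 1, step_k
         ! shift = local rows of C times the number of columns per block
         shift = (i-1) * localRows * bsize
         call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, &
                             nDim, bsize, CArray(COffset + 1 + shift), B(i), ierr)
         call MatAssemblyBegin(B(i), MAT_FINAL_ASSEMBLY, ierr)
         call MatAssemblyEnd(B(i), MAT_FINAL_ASSEMBLY, ierr)
      end do
      call MatRestoreArray(C, CArray, COffset, ierr)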
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Aug 6 10:20:42 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 10:20:42 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong: > Hong, > > Sure. > > I want to extend the Krylov subspace by step_k dimensions by using > monomial, which can be defined as > > K={Km(1)m Km(2), ..., Km(step_k)} > ={Km(1), AKm(1), AKm(2), ... , AKm(step_k-1)} > ={R, AR, A^2R, ... A^(step_k-1)R} > A subspace with dense matrices as basis? How large step_k and your matrices will be? Hong > > On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: > >> Cong, >> >> Can you write out math equations for mpk_monomial (), >> list input and output parameters. >> >> Note: >> 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End >> 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after >> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) >> >> Hong >> >> >> On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: >> >>> The entire source code files are attached. >>> >>> Also I copy and paste the here in this email >>> >>> thanks >>> >>> program test >>> >>> implicit none >>> >>> #include >>> #include >>> #include >>> #include >>> >>> >>> PetscViewer :: view >>> ! sparse matrix >>> Mat :: A >>> ! 
distributed dense matrix of size n x m >>> Mat :: B, X, R, QDlt, AQDlt >>> ! distributed dense matrix of size n x (m x k) >>> Mat :: Q, K, AQ_p, AQ >>> ! local dense matrix (every process keep the identical copies), (m x >>> k) x (m x k) >>> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >>> >>> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >>> step_k,bsize >>> PetscInt :: ownRowS,ownRowE >>> PetscScalar, allocatable :: XInit(:,:) >>> PetscInt :: XInitI, XInitJ >>> PetscScalar :: v=1.0 >>> PetscBool :: flg >>> PetscMPIInt :: size, rank >>> >>> character(128) :: fin, rhsfin >>> >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >>> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >>> >>> ! read binary matrix file >>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >>> >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >>> >>> >>> call >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >>> call MatCreate(PETSC_COMM_WORLD,A,ierr) >>> call MatSetType(A,MATAIJ,ierr) >>> call MatLoad(A,view,ierr) >>> call PetscViewerDestroy(view,ierr) >>> ! for the time being, assume mDim == nDim is true >>> call MatGetSize(A, nDim, mDim, ierr) >>> >>> if (rank == 0) then >>> print*,'Mat Size = ', nDim, mDim >>> end if >>> >>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >>> >>> ! create right-and-side matrix >>> ! for the time being, choose row-wise decomposition >>> ! for the time being, assume nDim%size = 0 >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >>> call >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) >>> call MatLoad(B,view,ierr) >>> call PetscViewerDestroy(view,ierr) >>> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >>> if (rank == 0) then >>> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >>> print*,'MRHS Size should be:', nDim, bsize >>> end if >>> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! inintial value guses X >>> allocate(XInit(nDim,bsize)) >>> do XInitI=1, nDim >>> do XInitJ=1, bsize >>> XInit(XInitI,XInitJ) = 1.0 >>> end do >>> end do >>> >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> bsize, nDim, bsize,XInit, X, ierr) >>> >>> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >>> >>> >>> ! B, X, R, QDlt, AQDlt >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >>> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >>> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >>> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! 
Q, K, AQ_p, AQ of size n x (m x k) >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> (bsize*step_k), nDim, >>> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >>> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >>> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >>> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >>> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >>> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >>> PETSC_NULL_SCALAR, QtAQ, ierr) >>> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >>> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >>> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >>> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! calculation for R >>> >>> ! call matrix powers kernel >>> call mpk_monomial (K, A, R, step_k, rank,size) >>> >>> ! destory matrices >>> deallocate(XInit) >>> >>> call MatDestroy(B, ierr) >>> call MatDestroy(X, ierr) >>> call MatDestroy(R, ierr) >>> call MatDestroy(QDlt, ierr) >>> call MatDestroy(AQDlt, ierr) >>> call MatDestroy(Q, ierr) >>> call MatDestroy(K, ierr) >>> call MatDestroy(AQ_p, ierr) >>> call MatDestroy(AQ, ierr) >>> call MatDestroy(QtAQ, ierr) >>> call MatDestroy(QtAQ_p, ierr) >>> call MatDestroy(Dlt, ierr) >>> >>> >>> call PetscFinalize(ierr) >>> >>> stop >>> >>> end program test >>> >>> >>> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >>> implicit none >>> >>> #include >>> #include >>> #include >>> #include >>> >>> Mat :: K, Km(step_k) >>> Mat :: A, R >>> PetscMPIInt :: sizeMPI, rank >>> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >>> PetscInt :: ierr >>> PetscInt :: stepIdx, blockShift, localRsize >>> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >>> PetscOffset :: KArrayOffset, RArrayOffset >>> >>> call MatGetSize(R, nDim, bsize, ierr) >>> if (rank == 0) then >>> print*,'Mat Size = ', nDim, bsize >>> end if >>> >>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>> >>> call MatGetLocalSize(R,local_RRow,local_RCol) >>> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >>> >>> ! get arry from R to add values to K(1) >>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + >>> 1), Km(1), ierr) >>> >>> >>> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & >>> ! 
,local_RRow * local_RCol * >>> STORAGE_SIZE(PetscScalarSize), ierr) >>> >>> localRsize = local_RRow * local_RCol >>> do genIdx= 1, localRsize >>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> end do >>> >>> >>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> >>> do stepIdx= 2, step_k >>> >>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), >>> Km(stepIdx), ierr) >>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> >>> end do >>> >>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> >>> ! do stepIdx= 2, step_k >>> do stepIdx= 2,2 >>> >>> call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> ! call >>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> end do >>> >>> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >>> >>> end subroutine mpk_monomial >>> >>> >>> >>> Cong Li >>> >>> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >>> >>>> >>>> Send the entire code so that we can compile it and run it ourselves >>>> to see what is going wrong. >>>> >>>> Barry >>>> >>>> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >>>> > >>>> > Hi >>>> > >>>> > I tried the method you suggested. However, I got the error message. >>>> > My code and message are below. >>>> > >>>> > K is the big matrix containing column matrices. >>>> > >>>> > code: >>>> > >>>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> > >>>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>>> > >>>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> > >>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> > PETSC_DECIDE , nDim, >>>> bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>> > >>>> > localRsize = local_RRow * local_RCol >>>> > do genIdx= 1, localRsize >>>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> > end do >>>> > >>>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> > >>>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> > >>>> > do stepIdx= 2, step_k >>>> > >>>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * >>>> local_RCol) >>>> > >>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> > PETSC_DECIDE , nDim, >>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> > end do >>>> > >>>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> > >>>> > do stepIdx= 2, step_k >>>> > >>>> > call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> > end do >>>> > >>>> > >>>> > And I got the error message as below: >>>> > >>>> > >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> > [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org 
on GNU/linux and Apple Mac OS X to >>>> find memory corruption errors >>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>> link, and run >>>> > [0]PETSC ERROR: to get more information on the crash. >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> > [0]PETSC ERROR: Signal received! >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>> 22:15:24 CDT 2013 >>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> > [0]PETSC ERROR: See docs/index.html for manual pages. >>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> > ---------------------------------------------------- >>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>> Wed Aug 5 18:24:40 2015 >>>> > [0]PETSC ERROR: Libraries linked from >>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> > >>>> -------------------------------------------------------------------------- >>>> > [mpi::mpi-api::mpi-abort] >>>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> > with errorcode 59. >>>> > >>>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> > You may or may not see output from other processes, depending on >>>> > exactly when Open MPI kills them. 
>>>> > >>>> -------------------------------------------------------------------------- >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>> [0xffffffff0091f684] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>> [0xffffffff006c389c] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>> [0xffffffff006db3ac] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>> [0xffffffff00281bf0] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>> > [p01-024:26516] [(nil)] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>>> [0xffffffff02d3b81c] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>>> the batch system) has told this process to end >>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> > [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>> find memory corruption errors >>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>> link, and run >>>> > [0]PETSC ERROR: to get more information on the crash. >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> > [0]PETSC ERROR: Signal received! >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>> 22:15:24 CDT 2013 >>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>> Wed Aug 5 18:24:40 2015 >>>> > [0]PETSC ERROR: Libraries linked from >>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> > [ERR.] PLE 0019 plexec One of MPI processes was >>>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>> > >>>> > However, if I change from >>>> > call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> > to >>>> > call MatMatMult(A,Km(stepIdx-1), >>>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> > >>>> > everything is fine. >>>> > >>>> > could you please suggest some way to solve this? >>>> > >>>> > Thanks >>>> > >>>> > Cong Li >>>> > >>>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>>> wrote: >>>> > Thank you very much for your help and suggestions. >>>> > With your help, finally I could continue my project. >>>> > >>>> > Regards >>>> > >>>> > Cong Li >>>> > >>>> > >>>> > >>>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>>> wrote: >>>> > >>>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>> created. >>>> > >>>> > Since you want to use the C that is passed in you should use >>>> MAT_REUSE_MATRIX. >>>> > >>>> > Note that since your B and C matrices are dense the issue of >>>> sparsity pattern of C is not relevant. >>>> > >>>> > Barry >>>> > >>>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>>> wrote: >>>> > > >>>> > > Thanks very much. This answer is very helpful. >>>> > > And I have a following question. >>>> > > If I create B1, B2, .. by the way you suggested and then use >>>> MatMatMult to do SPMM. >>>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>> fill,Mat *C) >>>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>> > > >>>> > > Thanks >>>> > > >>>> > > Cong Li >>>> > > >>>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>> wrote: >>>> > > >>>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>> wrote: >>>> > > > >>>> > > > I am sorry that I should have explained it more clearly. >>>> > > > Actually I want to compute a recurrence. 
>>>> > > > >>>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>>> A*B1=B2, A*B2=B3 and so on. >>>> > > > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 ...] >>>> > > >>>> > > First create C with MatCreateDense(,&C). Then call >>>> MatDenseGetArray(C,&array); then create B1 with >>>> MatCreateDense(....,array,&B1); then create >>>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>> create B3 with a larger shift etc. >>>> > > >>>> > > Note that you are "sharing" the array space of C with B1, B2, >>>> B3, ..., each Bi contains its columns of the C matrix. >>>> > > >>>> > > Barry >>>> > > >>>> > > >>>> > > >>>> > > > >>>> > > > Is there any way to do this efficiently. >>>> > > > >>>> > > > >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> wrote: >>>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > > Thanks for your reply. >>>> > > > > >>>> > > > > I have an other question. >>>> > > > > I want to do SPMM several times and combine result matrices >>>> into one bigger >>>> > > > > matrix. >>>> > > > > for example >>>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > > >>>> > > > > Could you please suggest a way of how to do this. >>>> > > > This is just linear algebra, nothing to do with PETSc >>>> specifically. >>>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > > >>>> > > > > Thanks >>>> > > > > >>>> > > > > Cong Li >>>> > > > > >>>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > > >>>> > > > > > Cong Li writes: >>>> > > > > > >>>> > > > > > > Hello, >>>> > > > > > > >>>> > > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> matrix-matrix >>>> > > > > > > multiplication) by using PETSc. >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > > >>>> > > > >>>> > > >>>> > > >>>> > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Thu Aug 6 11:22:29 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Thu, 6 Aug 2015 11:22:29 -0500 Subject: [petsc-users] Vec Allgather operation in PETSc Message-ID: Hi all, For a parallel Vec whose components are stored in N processes, I would like to have an "Allgatherv" operation to obtain a whole copy on each process. Can anyone tell me which function I should use? Thanks Xujun -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 6 11:26:53 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 18:26:53 +0200 Subject: [petsc-users] Vec Allgather operation in PETSc In-Reply-To: References: Message-ID: Use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html On 6 August 2015 at 18:22, Xujun Zhao wrote: > Hi all, > > For a parallel Vec whose components are stored in N processes, I would > like to have an "Allgatherv" operation to obtain a whole copy on each > process. Can anyone tell me which function I should use? Thanks > > Xujun > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Aug 6 11:35:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 11:35:59 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: > On Aug 6, 2015, at 10:09 AM, Hong wrote: > > Barry: > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when > a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. > > Petsc implementation of MatMatMult() assumes user call > MatMatMultSymbolic() first, in which, we define which > MatMatMultNumeric() routine to be followed, and most importantly, we create specific data structure to be reused by MatMatMultNumeric(). > In MatMatMultSymbolic_MPIAIJ_MPIDense(), we create a 'container'. > > Without calling the case of MAT_INITIAL_MATRIX, these info are missing, and code simply crashes. Sure but in this case (with dense matrices) the container is very simple and we can create it the first time in if MAT_REUSE_MATRIX is passed in but the container is not there already. For the sparse result case you are right it doesn't make sense since the nonzero structure of the matrix needs to be figured out by symbolic factorization. Barry > > Hong > > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! 
for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > ! 
call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From xzhao99 at gmail.com Thu Aug 6 14:57:52 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Thu, 6 Aug 2015 14:57:52 -0500 Subject: [petsc-users] Vec Allgather operation in PETSc In-Reply-To: References: Message-ID: Hi Dave, Thank you. This solves my problem! Xujun On Thu, Aug 6, 2015 at 11:26 AM, Dave May wrote: > Use this > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html > > On 6 August 2015 at 18:22, Xujun Zhao wrote: > >> Hi all, >> >> For a parallel Vec whose components are stored in N processes, I would >> like to have an "Allgatherv" operation to obtain a whole copy on each >> process. Can anyone tell me which function I should use? Thanks >> >> Xujun >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juris.vencels at gmail.com Thu Aug 6 16:16:57 2015 From: juris.vencels at gmail.com (Juris Vencels) Date: Thu, 06 Aug 2015 15:16:57 -0600 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance Message-ID: <55C3CEC9.2000705@gmail.com> Hi Users, When I construct analytical Jacobian matrix it has many small values of order 1E-16. How can I remove these values that are less than a given tolerance, let's say 1E-10? I tried to use MatChop together with MatCopy and MatDuplicate, but none of these functions ignores zeros. Thanks! 
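A minimal sketch of the kind of manual thresholding being described, using only the generic Mat row routines (MatGetOwnershipRange, MatGetRow, MatSetValues). It is an illustration, not code from this thread: it assumes a second matrix Afilt has already been created and preallocated with the same row distribution as A, and the function name FilterSmallEntries and the tolerance argument tol are made up for the example. Whether dropping such entries is worthwhile at all is taken up in the replies below.

#include <petscmat.h>

/* Copy only the entries of A whose magnitude exceeds tol into Afilt.
   Afilt must already exist with the same row distribution as A and
   enough preallocated space for the surviving entries.              */
PetscErrorCode FilterSmallEntries(Mat A, Mat Afilt, PetscReal tol)
{
  PetscErrorCode    ierr;
  PetscInt          rstart, rend, row, ncols, j;
  const PetscInt    *cols;
  const PetscScalar *vals;

  /* loop only over the rows owned by this process */
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (row = rstart; row < rend; row++) {
    ierr = MatGetRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
    for (j = 0; j < ncols; j++) {
      /* keep an entry only if its magnitude is above the threshold */
      if (PetscAbsScalar(vals[j]) > tol) {
        ierr = MatSetValues(Afilt, 1, &row, 1, &cols[j], &vals[j], INSERT_VALUES);CHKERRQ(ierr);
      }
    }
    ierr = MatRestoreRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(Afilt, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Afilt, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  return 0;
}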
From knepley at gmail.com Thu Aug 6 16:22:36 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Aug 2015 16:22:36 -0500 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance In-Reply-To: <55C3CEC9.2000705@gmail.com> References: <55C3CEC9.2000705@gmail.com> Message-ID: On Thu, Aug 6, 2015 at 4:16 PM, Juris Vencels wrote: > Hi Users, > > > When I construct analytical Jacobian matrix it has many small values of > order 1E-16. > > How can I remove these values that are less than a given tolerance, let's > say 1E-10? > > I tried to use MatChop together with MatCopy and MatDuplicate, but none of > these functions ignores zeros. > Do you mean that you want to change the sparsity pattern? We do not have a function which does this. It would require a copy to regain the lost memory, and its not normally worth the trouble. Do you have some data or a model that tells you its worth it in your case? This is a question I always ask myself before programming. Thanks, Matt > Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 6 16:55:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 16:55:11 -0500 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance In-Reply-To: <55C3CEC9.2000705@gmail.com> References: <55C3CEC9.2000705@gmail.com> Message-ID: > On Aug 6, 2015, at 4:16 PM, Juris Vencels wrote: > > Hi Users, > > > When I construct analytical Jacobian matrix it has many small values of order 1E-16. Are the values at those locations always that small or at different Newton steps or time-steps will they be larger? Unless there are a huge number of these and you know they are always small then I would not try to take them out. If you don't want them in there then don't put them in orginally; that is don't call MatSetValues() at all for those really small locations and don't allocate space for them. Barry > > How can I remove these values that are less than a given tolerance, let's say 1E-10? > > I tried to use MatChop together with MatCopy and MatDuplicate, but none of these functions ignores zeros. > > > Thanks! From bsmith at mcs.anl.gov Thu Aug 6 18:47:39 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 18:47:39 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> Cong Li, I have updated PETSc to support the use of MatMatMult() per your needs. You will need to switch to the master development branch http://www.mcs.anl.gov/petsc/developers/index.html of PETSc so install that first. I found a number of bugs in your code that I needed to fix to get it to run successfully on 1 and 2 processes to correctly load the matrices and do everything else it was doing -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1f.F90 Type: application/octet-stream Size: 4403 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mpk_monomial.F90 Type: application/octet-stream Size: 2366 bytes Desc: not available URL: -------------- next part -------------- with the MatMatMult() (note I do not think it generates the right numbers but at least it doesn't crash and does successfully do the MatMatMult(). I've attached the fixed files. Barry > On Aug 6, 2015, at 12:27 AM, Cong Li wrote: > > Barry, > > Exactly. And thanks for the explaination. > > Cong Li > > On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > Cong, > > > > Can you write out math equations for mpk_monomial (), > > list input and output parameters. > > > > Note: > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. > > Barry > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! 
for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! 
destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. 
> > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Thu Aug 6 20:08:25 2015 From: solvercorleone at gmail.com (Cong Li) Date: Fri, 7 Aug 2015 10:08:25 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, >>A subspace with dense matrices as basis? >>How large step_k and your matrices will be? So far, I are not very sure how large it's gonna be in the future. But I use less than 50 right now. However, I hope it can be as large as possible. Cong Li On Fri, Aug 7, 2015 at 12:20 AM, Hong wrote: > Cong: > >> Hong, >> >> Sure. >> >> I want to extend the Krylov subspace by step_k dimensions by using >> monomial, which can be defined as >> >> K={Km(1)m Km(2), ..., Km(step_k)} >> ={Km(1), AKm(1), AKm(2), ... , AKm(step_k-1)} >> ={R, AR, A^2R, ... A^(step_k-1)R} >> > > A subspace with dense matrices as basis? > How large step_k and your matrices will be? > > Hong > >> >> On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: >> >>> Cong, >>> >>> Can you write out math equations for mpk_monomial (), >>> list input and output parameters. >>> >>> Note: >>> 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End >>> 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after >>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) >>> >>> Hong >>> >>> >>> On Wed, Aug 5, 2015 at 8:56 PM, Cong Li >>> wrote: >>> >>>> The entire source code files are attached. 
>>>> >>>> Also I copy and paste the here in this email >>>> >>>> thanks >>>> >>>> program test >>>> >>>> implicit none >>>> >>>> #include >>>> #include >>>> #include >>>> #include >>>> >>>> >>>> PetscViewer :: view >>>> ! sparse matrix >>>> Mat :: A >>>> ! distributed dense matrix of size n x m >>>> Mat :: B, X, R, QDlt, AQDlt >>>> ! distributed dense matrix of size n x (m x k) >>>> Mat :: Q, K, AQ_p, AQ >>>> ! local dense matrix (every process keep the identical copies), (m x >>>> k) x (m x k) >>>> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >>>> >>>> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >>>> step_k,bsize >>>> PetscInt :: ownRowS,ownRowE >>>> PetscScalar, allocatable :: XInit(:,:) >>>> PetscInt :: XInitI, XInitJ >>>> PetscScalar :: v=1.0 >>>> PetscBool :: flg >>>> PetscMPIInt :: size, rank >>>> >>>> character(128) :: fin, rhsfin >>>> >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>>> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >>>> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >>>> >>>> ! read binary matrix file >>>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >>>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >>>> >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >>>> >>>> >>>> call >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >>>> call MatCreate(PETSC_COMM_WORLD,A,ierr) >>>> call MatSetType(A,MATAIJ,ierr) >>>> call MatLoad(A,view,ierr) >>>> call PetscViewerDestroy(view,ierr) >>>> ! for the time being, assume mDim == nDim is true >>>> call MatGetSize(A, nDim, mDim, ierr) >>>> >>>> if (rank == 0) then >>>> print*,'Mat Size = ', nDim, mDim >>>> end if >>>> >>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >>>> >>>> ! create right-and-side matrix >>>> ! for the time being, choose row-wise decomposition >>>> ! for the time being, assume nDim%size = 0 >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >>>> call >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) >>>> call MatLoad(B,view,ierr) >>>> call PetscViewerDestroy(view,ierr) >>>> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >>>> if (rank == 0) then >>>> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >>>> print*,'MRHS Size should be:', nDim, bsize >>>> end if >>>> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! inintial value guses X >>>> allocate(XInit(nDim,bsize)) >>>> do XInitI=1, nDim >>>> do XInitJ=1, bsize >>>> XInit(XInitI,XInitJ) = 1.0 >>>> end do >>>> end do >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> bsize, nDim, bsize,XInit, X, ierr) >>>> >>>> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> >>>> ! 
B, X, R, QDlt, AQDlt >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >>>> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >>>> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >>>> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! Q, K, AQ_p, AQ of size n x (m x k) >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> (bsize*step_k), nDim, >>>> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >>>> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >>>> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >>>> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >>>> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >>>> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >>>> PETSC_NULL_SCALAR, QtAQ, ierr) >>>> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >>>> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >>>> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >>>> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! calculation for R >>>> >>>> ! call matrix powers kernel >>>> call mpk_monomial (K, A, R, step_k, rank,size) >>>> >>>> ! 
destory matrices >>>> deallocate(XInit) >>>> >>>> call MatDestroy(B, ierr) >>>> call MatDestroy(X, ierr) >>>> call MatDestroy(R, ierr) >>>> call MatDestroy(QDlt, ierr) >>>> call MatDestroy(AQDlt, ierr) >>>> call MatDestroy(Q, ierr) >>>> call MatDestroy(K, ierr) >>>> call MatDestroy(AQ_p, ierr) >>>> call MatDestroy(AQ, ierr) >>>> call MatDestroy(QtAQ, ierr) >>>> call MatDestroy(QtAQ_p, ierr) >>>> call MatDestroy(Dlt, ierr) >>>> >>>> >>>> call PetscFinalize(ierr) >>>> >>>> stop >>>> >>>> end program test >>>> >>>> >>>> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >>>> implicit none >>>> >>>> #include >>>> #include >>>> #include >>>> #include >>>> >>>> Mat :: K, Km(step_k) >>>> Mat :: A, R >>>> PetscMPIInt :: sizeMPI, rank >>>> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >>>> PetscInt :: ierr >>>> PetscInt :: stepIdx, blockShift, localRsize >>>> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >>>> PetscOffset :: KArrayOffset, RArrayOffset >>>> >>>> call MatGetSize(R, nDim, bsize, ierr) >>>> if (rank == 0) then >>>> print*,'Mat Size = ', nDim, bsize >>>> end if >>>> >>>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> >>>> call MatGetLocalSize(R,local_RRow,local_RCol) >>>> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >>>> >>>> ! get arry from R to add values to K(1) >>>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >>>> + 1), Km(1), ierr) >>>> >>>> >>>> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) >>>> & >>>> ! ,local_RRow * local_RCol * >>>> STORAGE_SIZE(PetscScalarSize), ierr) >>>> >>>> localRsize = local_RRow * local_RCol >>>> do genIdx= 1, localRsize >>>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> end do >>>> >>>> >>>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, >>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> end do >>>> >>>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> >>>> ! do stepIdx= 2, step_k >>>> do stepIdx= 2,2 >>>> >>>> call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> ! call >>>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> end do >>>> >>>> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>> >>>> end subroutine mpk_monomial >>>> >>>> >>>> >>>> Cong Li >>>> >>>> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >>>> >>>>> >>>>> Send the entire code so that we can compile it and run it ourselves >>>>> to see what is going wrong. >>>>> >>>>> Barry >>>>> >>>>> > On Aug 5, 2015, at 4:42 AM, Cong Li >>>>> wrote: >>>>> > >>>>> > Hi >>>>> > >>>>> > I tried the method you suggested. However, I got the error message. >>>>> > My code and message are below. >>>>> > >>>>> > K is the big matrix containing column matrices. 
>>>>> > >>>>> > code: >>>>> > >>>>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>>>> > >>>>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>>>> > >>>>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>>>> > >>>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>>> > PETSC_DECIDE , nDim, >>>>> bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>>> > >>>>> > localRsize = local_RRow * local_RCol >>>>> > do genIdx= 1, localRsize >>>>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>>> > end do >>>>> > >>>>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>>> > >>>>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>>> > >>>>> > do stepIdx= 2, step_k >>>>> > >>>>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * >>>>> local_RCol) >>>>> > >>>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>>> > PETSC_DECIDE , nDim, >>>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>>> > end do >>>>> > >>>>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>>> > >>>>> > do stepIdx= 2, step_k >>>>> > >>>>> > call >>>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>>> ierr) >>>>> > end do >>>>> > >>>>> > >>>>> > And I got the error message as below: >>>>> > >>>>> > >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>> Violation, probably memory access out of range >>>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>>> -on_error_attach_debugger >>>>> > [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>>> find memory corruption errors >>>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>> link, and run >>>>> > [0]PETSC ERROR: to get more information on the crash. >>>>> > [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> > [0]PETSC ERROR: Signal received! >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>>> 22:15:24 CDT 2013 >>>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>> Violation, probably memory access out of range >>>>> > ---------------------------------------------------- >>>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>>> Wed Aug 5 18:24:40 2015 >>>>> > [0]PETSC ERROR: Libraries linked from >>>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>>> unknown file >>>>> > >>>>> -------------------------------------------------------------------------- >>>>> > [mpi::mpi-api::mpi-abort] >>>>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>> > with errorcode 59. >>>>> > >>>>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>> > You may or may not see output from other processes, depending on >>>>> > exactly when Open MPI kills them. 
>>>>> > >>>>> -------------------------------------------------------------------------- >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>> [0xffffffff0091f684] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>> [0xffffffff006c389c] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>> [0xffffffff006db3ac] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>> [0xffffffff00281bf0] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>>> > [p01-024:26516] [(nil)] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>>>> [0xffffffff02d3b81c] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>>>> the batch system) has told this process to end >>>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>>> -on_error_attach_debugger >>>>> > [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>>> find memory corruption errors >>>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>> link, and run >>>>> > [0]PETSC ERROR: to get more information on the crash. >>>>> > [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> > [0]PETSC ERROR: Signal received! >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>>> 22:15:24 CDT 2013 >>>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>>> Wed Aug 5 18:24:40 2015 >>>>> > [0]PETSC ERROR: Libraries linked from >>>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>>> unknown file >>>>> > [ERR.] PLE 0019 plexec One of MPI processes was >>>>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>>> > >>>>> > However, if I change from >>>>> > call >>>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>>> ierr) >>>>> > to >>>>> > call MatMatMult(A,Km(stepIdx-1), >>>>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>>> > >>>>> > everything is fine. >>>>> > >>>>> > could you please suggest some way to solve this? >>>>> > >>>>> > Thanks >>>>> > >>>>> > Cong Li >>>>> > >>>>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>>>> wrote: >>>>> > Thank you very much for your help and suggestions. >>>>> > With your help, finally I could continue my project. >>>>> > >>>>> > Regards >>>>> > >>>>> > Cong Li >>>>> > >>>>> > >>>>> > >>>>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>>>> wrote: >>>>> > >>>>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>>> created. >>>>> > >>>>> > Since you want to use the C that is passed in you should use >>>>> MAT_REUSE_MATRIX. >>>>> > >>>>> > Note that since your B and C matrices are dense the issue of >>>>> sparsity pattern of C is not relevant. >>>>> > >>>>> > Barry >>>>> > >>>>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>>>> wrote: >>>>> > > >>>>> > > Thanks very much. This answer is very helpful. >>>>> > > And I have a following question. >>>>> > > If I create B1, B2, .. by the way you suggested and then use >>>>> MatMatMult to do SPMM. >>>>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>>> fill,Mat *C) >>>>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>> > > >>>>> > > Thanks >>>>> > > >>>>> > > Cong Li >>>>> > > >>>>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>>> wrote: >>>>> > > >>>>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>>> wrote: >>>>> > > > >>>>> > > > I am sorry that I should have explained it more clearly. >>>>> > > > Actually I want to compute a recurrence. 
>>>>> > > > >>>>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>>>> A*B1=B2, A*B2=B3 and so on. >>>>> > > > Finally I want to combine all these results into a bigger matrix >>>>> C=[B1,B2 ...] >>>>> > > >>>>> > > First create C with MatCreateDense(,&C). Then call >>>>> MatDenseGetArray(C,&array); then create B1 with >>>>> MatCreateDense(....,array,&B1); then create >>>>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>>> create B3 with a larger shift etc. >>>>> > > >>>>> > > Note that you are "sharing" the array space of C with B1, B2, >>>>> B3, ..., each Bi contains its columns of the C matrix. >>>>> > > >>>>> > > Barry >>>>> > > >>>>> > > >>>>> > > >>>>> > > > >>>>> > > > Is there any way to do this efficiently. >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>>> patrick.sanan at gmail.com> wrote: >>>>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>> > > > > Thanks for your reply. >>>>> > > > > >>>>> > > > > I have an other question. >>>>> > > > > I want to do SPMM several times and combine result matrices >>>>> into one bigger >>>>> > > > > matrix. >>>>> > > > > for example >>>>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>> > > > > >>>>> > > > > Could you please suggest a way of how to do this. >>>>> > > > This is just linear algebra, nothing to do with PETSc >>>>> specifically. >>>>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>> > > > > >>>>> > > > > Thanks >>>>> > > > > >>>>> > > > > Cong Li >>>>> > > > > >>>>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>>> wrote: >>>>> > > > > >>>>> > > > > > Cong Li writes: >>>>> > > > > > >>>>> > > > > > > Hello, >>>>> > > > > > > >>>>> > > > > > > I am a PhD student using PETsc for my research. >>>>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>>> matrix-matrix >>>>> > > > > > > multiplication) by using PETSc. >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>> > > > > > >>>>> > > > >>>>> > > >>>>> > > >>>>> > >>>>> > >>>>> > >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Thu Aug 6 20:21:12 2015 From: solvercorleone at gmail.com (Cong Li) Date: Fri, 7 Aug 2015 10:21:12 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> Message-ID: Barry, Thank you very much. I will install and try the updated version. Regards Cong Li On Fri, Aug 7, 2015 at 8:47 AM, Barry Smith wrote: > > Cong Li, > > I have updated PETSc to support the use of MatMatMult() per your > needs. You will need to switch to the master development branch > http://www.mcs.anl.gov/petsc/developers/index.html of PETSc so install > that first. 
> > I found a number of bugs in your code that I needed to fix to get it to > run successfully on 1 and 2 processes to correctly load the matrices and do > everything else it was doing > with the MatMatMult() (note I do not think it generates the right numbers > but at least it doesn't crash and does successfully do the MatMatMult(). > I've attached the fixed files. > > Barry > > > > On Aug 6, 2015, at 12:27 AM, Cong Li wrote: > > > > Barry, > > > > Exactly. And thanks for the explaination. > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > > > Cong, > > > > > > Can you write out math equations for mpk_monomial (), > > > list input and output parameters. > > > > > > Note: > > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it > was created which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense it is not the difficult case when a symbolic > computation needs to be done initially so, at least in theory, he should > not have to use MAT_INITIAL_MATRIX the first time through. > > > > Barry > > > > > > > > Hong > > > > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li > wrote: > > > The entire source code files are attached. > > > > > > Also I copy and paste the here in this email > > > > > > thanks > > > > > > program test > > > > > > implicit none > > > > > > #include > > > #include > > > #include > > > #include > > > > > > > > > PetscViewer :: view > > > ! sparse matrix > > > Mat :: A > > > ! distributed dense matrix of size n x m > > > Mat :: B, X, R, QDlt, AQDlt > > > ! distributed dense matrix of size n x (m x k) > > > Mat :: Q, K, AQ_p, AQ > > > ! local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > > PetscInt :: ownRowS,ownRowE > > > PetscScalar, allocatable :: XInit(:,:) > > > PetscInt :: XInitI, XInitJ > > > PetscScalar :: v=1.0 > > > PetscBool :: flg > > > PetscMPIInt :: size, rank > > > > > > character(128) :: fin, rhsfin > > > > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > > > ! read binary matrix file > > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > > call MatSetType(A,MATAIJ,ierr) > > > call MatLoad(A,view,ierr) > > > call PetscViewerDestroy(view,ierr) > > > ! 
for the time being, assume mDim == nDim is true > > > call MatGetSize(A, nDim, mDim, ierr) > > > > > > if (rank == 0) then > > > print*,'Mat Size = ', nDim, mDim > > > end if > > > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > > > ! create right-and-side matrix > > > ! for the time being, choose row-wise decomposition > > > ! for the time being, assume nDim%size = 0 > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > > call MatLoad(B,view,ierr) > > > call PetscViewerDestroy(view,ierr) > > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > > if (rank == 0) then > > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > > print*,'MRHS Size should be:', nDim, bsize > > > end if > > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! inintial value guses X > > > allocate(XInit(nDim,bsize)) > > > do XInitI=1, nDim > > > do XInitJ=1, bsize > > > XInit(XInitI,XInitJ) = 1.0 > > > end do > > > end do > > > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > bsize, nDim, bsize,XInit, X, ierr) > > > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > > > > ! B, X, R, QDlt, AQDlt > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > > call > MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > > PETSC_NULL_SCALAR, QtAQ, ierr) > > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! calculation for R > > > > > > ! call matrix powers kernel > > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > > > ! destory matrices > > > deallocate(XInit) > > > > > > call MatDestroy(B, ierr) > > > call MatDestroy(X, ierr) > > > call MatDestroy(R, ierr) > > > call MatDestroy(QDlt, ierr) > > > call MatDestroy(AQDlt, ierr) > > > call MatDestroy(Q, ierr) > > > call MatDestroy(K, ierr) > > > call MatDestroy(AQ_p, ierr) > > > call MatDestroy(AQ, ierr) > > > call MatDestroy(QtAQ, ierr) > > > call MatDestroy(QtAQ_p, ierr) > > > call MatDestroy(Dlt, ierr) > > > > > > > > > call PetscFinalize(ierr) > > > > > > stop > > > > > > end program test > > > > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > > implicit none > > > > > > #include > > > #include > > > #include > > > #include > > > > > > Mat :: K, Km(step_k) > > > Mat :: A, R > > > PetscMPIInt :: sizeMPI, rank > > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > > PetscInt :: ierr > > > PetscInt :: stepIdx, blockShift, localRsize > > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > > PetscOffset :: KArrayOffset, RArrayOffset > > > > > > call MatGetSize(R, nDim, bsize, ierr) > > > if (rank == 0) then > > > print*,'Mat Size = ', nDim, bsize > > > end if > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > > > ! get arry from R to add values to K(1) > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + > 1) & > > > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > ! do stepIdx= 2, step_k > > > do stepIdx= 2,2 > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > end subroutine mpk_monomial > > > > > > > > > > > > Cong Li > > > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith > wrote: > > > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > > > Barry > > > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li > wrote: > > > > > > > > Hi > > > > > > > > I tried the method you suggested. However, I got the error message. > > > > My code and message are below. > > > > > > > > K is the big matrix containing column matrices. > > > > > > > > code: > > > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > > PETSC_DECIDE , nDim, > bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > > > localRsize = local_RRow * local_RCol > > > > do genIdx= 1, localRsize > > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > > end do > > > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > > > do stepIdx= 2, step_k > > > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * > local_RCol) > > > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > > > do stepIdx= 2, step_k > > > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > > end do > > > > > > > > > > > > And I got the error message as below: > > > > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, > link, and run > > > > [0]PETSC ERROR: to get more information on the crash. > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > > [0]PETSC ERROR: Signal received! > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > ---------------------------------------------------- > > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > > -------------------------------------------------------------------------- > > > > [mpi::mpi-api::mpi-abort] > > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > > with errorcode 59. > > > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > > You may or may not see output from other processes, depending on > > > > exactly when Open MPI kills them. 
> > > > > -------------------------------------------------------------------------- > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > > [p01-024:26516] [(nil)] > > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, > link, and run > > > > [0]PETSC ERROR: to get more information on the crash. > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > > [0]PETSC ERROR: Signal received! > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > > > However, if I change from > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > > to > > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > > > everything is fine. > > > > > > > > could you please suggest some way to solve this? > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > > Thank you very much for your help and suggestions. > > > > With your help, finally I could continue my project. > > > > > > > > Regards > > > > > > > > Cong Li > > > > > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > > > Barry > > > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > > > Thanks very much. This answer is very helpful. > > > > > And I have a following question. > > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > > Actually I want to compute a recurrence. > > > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. 
> > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, > B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > > Thanks for your reply. > > > > > > > > > > > > > > I have an other question. > > > > > > > I want to do SPMM several times and combine result matrices > into one bigger > > > > > > > matrix. > > > > > > > for example > > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > > This is just linear algebra, nothing to do with PETSc > specifically. > > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > Cong Li > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Aug 6 21:16:49 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 6 Aug 2015 21:16:49 -0500 Subject: [petsc-users] Issues running with intel MPI compiler Message-ID: Hi all, I configured PETSc using my university's intel compilers. I configured with these options: ./configure --download-chaco --download-ctetgen --download-exodusii --download-fblaslapack --download-hdf5 --download-hypre --download-metis --download-netcdf --download-parmetis --download-triangle --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/5.0.2.044/intel64 --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 when I run any examples via make, i get the following error: > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd (/tmp/mpd2.console_jchang23); possible causes: > 1. no mpd is running on this host > 2. an mpd is running but was started without a "console" (-n option) However, if I simply run /share/apps/intel/impi/ 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone know why this is happening? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... 
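A minimal sketch of the shared-storage layout Barry describes in the SPMM thread above: C is one dense matrix and B[0..2] are views of consecutive column blocks of C's local array. The code is not from the thread; all names and sizes are illustrative, A and X1 are assumed to be assembled elsewhere, and error checking (CHKERRQ) is omitted. Note that, as reported at the top of that thread, the MAT_REUSE_MATRIX call in this pattern segfaulted for the poster under petsc-3.3 while MAT_INITIAL_MATRIX worked.

/* C is Nglobal x 3k; each B[i] shares C's storage, offset by one column block */
Mat          A, X1, C, B[3];
PetscScalar *array;
PetscInt     i, mloc, Nglobal, k;   /* Nglobal, k set by the application */

MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, Nglobal, 3*k, NULL, &C);
MatDenseGetArray(C, &array);
MatGetLocalSize(C, &mloc, NULL);    /* local rows of C (and of each B[i]) */
for (i = 0; i < 3; i++) {
  /* shift = local rows times columns per block, as in Barry's description */
  MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, Nglobal, k,
                 array + i*mloc*k, &B[i]);
}

/* the recurrence B1 = A*X1, B2 = A*B1, ... writes straight into C's storage */
MatMatMult(A, X1, MAT_REUSE_MATRIX, PETSC_DEFAULT, &B[0]);
for (i = 1; i < 3; i++) {
  MatMatMult(A, B[i-1], MAT_REUSE_MATRIX, PETSC_DEFAULT, &B[i]);
}
MatDenseRestoreArray(C, &array);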
URL: From balay at mcs.anl.gov Fri Aug 7 00:04:10 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 7 Aug 2015 00:04:10 -0500 Subject: [petsc-users] Issues running with intel MPI compiler In-Reply-To: References: Message-ID: perhaps you'll see the issue with: /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec -n 2 my_program In this case - you can retry with: /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra -n 2 my_program wrt running example with make - you can try equivalent of make MPIEXEC=/share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra test Satish On Thu, 6 Aug 2015, Justin Chang wrote: > Hi all, > > I configured PETSc using my university's intel compilers. I configured with > these options: > > ./configure --download-chaco --download-ctetgen --download-exodusii > --download-fblaslapack --download-hdf5 --download-hypre --download-metis > --download-netcdf --download-parmetis --download-triangle > --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/5.0.2.044/intel64 > --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 > PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 > > when I run any examples via make, i get the following error: > > > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd > (/tmp/mpd2.console_jchang23); possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > However, if I simply run /share/apps/intel/impi/ > 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone know > why this is happening? > > Thanks, > Justin > From Mahir.Ulker-Kaustell at tyrens.se Fri Aug 7 04:59:05 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Fri, 7 Aug 2015 09:59:05 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Message-ID: <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Hong, Running example 2 with the command line given below gives me two uniprocessor runs!? 
$ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 total: nonzeros=250, allocated nonzeros=280 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 5.21214e-15 iterations 1 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 total: nonzeros=250, allocated nonzeros=280 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 5.21214e-15 iterations 1 Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 6 augusti 2015 16:36 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: I have been using PETSC_COMM_WORLD. What do you get by running a petsc example, e.g., petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 2 MPI processes type: gmres ... Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 5 augusti 2015 17:11 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: As you noticed, you ran the code in serial mode, not parallel. 
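A quick way to confirm this from inside the program (a sketch, not code from the thread) is to print the size of PETSC_COMM_WORLD right after PetscInitialize(); with a launcher that does not match the MPI library PETSc was built with, every process reports a size of 1 and the "parallel" run is really two independent serial runs.

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscMPIInt size, rank;
  KSP         ksp;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  /* a mismatched mpiexec shows up here as "size = 1" on every process */
  PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] PETSC_COMM_WORLD size = %d\n", rank, size);
  PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);

  /* the solver must also live on the parallel communicator, not PETSC_COMM_SELF */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  /* ... set operators, solve ... */
  KSPDestroy(&ksp);
  PetscFinalize();
  return 0;
}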
Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. 
petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. 
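(For reference, Hong's three rebuild steps above as a shell sketch; the tarball location and the trailing configure options are placeholders for whatever was used in the original build, with the PETSC_ARCH name taken from the error output earlier in this thread.)

# step 1: download superlu_dist_4.1.tar.gz into the current directory
cd $PETSC_DIR
rm -rf cygwin-complex-nodebug                      # step 2: remove the old PETSC_ARCH directory
./configure PETSC_ARCH=cygwin-complex-nodebug \
    --download-superlu_dist=superlu_dist_4.1.tar.gz \
    --with-scalar-type=complex --with-debugging=0  # ... remaining original options
make PETSC_ARCH=cygwin-complex-nodebug all         # step 3: build petsc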
What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) 
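As an aside on the two questions above (how to set options.ColPerm, and whether the flag alone is enough): the petsc-3.5/3.6 SuperLU_DIST interface exposes these settings as runtime options, and the exact names and accepted values should be confirmed with -help. A hedged sketch of the intended usage:

# Option names/values as exposed by the petsc-3.5/3.6 SuperLU_DIST interface;
# confirm the exact list with ./solve -help | grep superlu_dist.
# -mat_superlu_dist_parsymbfact is taken as a plain flag (as Hong notes above,
# the "=1" form is not recognized), and -mat_superlu_dist_colperm sets
# options.ColPerm; parallel symbolic factorization needs the PARMETIS ordering.
mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu \
    -pc_factor_mat_solver_package superlu_dist \
    -mat_superlu_dist_parsymbfact \
    -mat_superlu_dist_colperm PARMETIS \
    -ksp_view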
Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to 
uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: 
MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) 
==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== 
Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) 
==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== 
by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... 
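For the frequency sweep in the original question, Hong's point 2 above (do the symbolic factorization once and reuse it as omega changes) could look roughly like the following. This is a sketch, not code from the thread: names are illustrative, the nonzero pattern of M is assumed to be contained in that of K, omega[0..nomega-1] is assumed to come from the application, and error checking is omitted.

/* keep the same matrix object A across the sweep, so its nonzero pattern --
 * and with a direct solver the symbolic factorization -- is set up only once */
Mat         M, K, A;
Vec         F, u;
KSP         ksp;
PetscInt    i, nomega;
PetscReal  *omega;

/* ... assemble M, K and the load vector F as usual ... */
MatDuplicate(K, MAT_COPY_VALUES, &A);
VecDuplicate(F, &u);
KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetFromOptions(ksp);   /* e.g. -ksp_type preonly -pc_type lu
                                  -pc_factor_mat_solver_package superlu_dist */
for (i = 0; i < nomega; i++) {
  MatCopy(K, A, SAME_NONZERO_PATTERN);                        /* A  = K           */
  MatAXPY(A, -omega[i]*omega[i], M, SUBSET_NONZERO_PATTERN);  /* A -= omega^2 * M */
  KSPSetOperators(ksp, A, A);   /* values changed, pattern did not: only the
                                   numeric factorization is redone              */
  KSPSolve(ksp, F, u);
  /* ... store or post-process u for this omega ... */
}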
URL: From jychang48 at gmail.com Fri Aug 7 09:36:30 2015 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 7 Aug 2015 09:36:30 -0500 Subject: [petsc-users] Issues running with intel MPI compiler In-Reply-To: References: Message-ID: That did the trick, thank you very much On Fri, Aug 7, 2015 at 12:04 AM, Satish Balay wrote: > perhaps you'll see the issue with: > > /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec -n 2 my_program > > In this case - you can retry with: > > /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra -n 2 > my_program > > wrt running example with make - you can try equivalent of > > make MPIEXEC=/share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra > test > > Satish > > On Thu, 6 Aug 2015, Justin Chang wrote: > > > Hi all, > > > > I configured PETSc using my university's intel compilers. I configured > with > > these options: > > > > ./configure --download-chaco --download-ctetgen --download-exodusii > > --download-fblaslapack --download-hdf5 --download-hypre --download-metis > > --download-netcdf --download-parmetis --download-triangle > > --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/ > 5.0.2.044/intel64 > > --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 > > PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 > > > > when I run any examples via make, i get the following error: > > > > > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd > > (/tmp/mpd2.console_jchang23); possible causes: > > > > > 1. no mpd is running on this host > > > > > 2. an mpd is running but was started without a "console" (-n option) > > > > > > However, if I simply run /share/apps/intel/impi/ > > 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone > know > > why this is happening? > > > > Thanks, > > Justin > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Aug 7 11:08:43 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 7 Aug 2015 11:08:43 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? 
> > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... 
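For reference, a minimal sketch of making the same solver selection in code instead of via -pc_type lu -pc_factor_mat_solver_package superlu_dist. The calls are the standard PETSc 3.5/3.6-era KSP/PC API; the function name and the assumption that A and b are already assembled on PETSC_COMM_WORLD are illustrative and not taken from Hong's or Mahir's programs:

#include <petscksp.h>

/* Solve A x = b with a SuperLU_DIST LU factorization; KSPSetFromOptions()
   is kept so -ksp_view and -mat_superlu_dist_* runtime options still apply. */
PetscErrorCode SolveWithSuperLUDist(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);               /* pure direct solve */
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

Calling KSPSetFromOptions() after the programmatic choices means command-line options such as -ksp_view or a different -pc_factor_mat_solver_package can still override the defaults at run time.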
> > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? > KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
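Following Hong's suggestion above to check the input communicator, a small sketch that prints the size of the communicator the KSP will be created on. A mismatched launcher (the "wrong mpiexec" case Satish describes) shows up immediately as every rank reporting size 1, i.e. independent serial runs. The print statement and its placement are illustrative, not part of Mahir's solver:

#include <petscksp.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    size;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* If the launcher does not match the MPI that PETSc was built with,
     each process sees size == 1 and the "parallel" job is really N serial jobs. */
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"PETSC_COMM_WORLD size = %d\n",(int)size);CHKERRQ(ierr);
  /* ... KSPCreate(PETSC_COMM_WORLD,&ksp) and the rest of the solver go here ... */
  ierr = PetscFinalize();
  return ierr;
}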
> > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. 
If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li > > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se > > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >, "Xiaoye S. Li" > > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. 
Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. 
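The interface behavior Hong describes above can be written out as a sketch against the SuperLU_DIST options structure. The struct and field names (superlu_options_t, ParSymbFact, ColPerm, PARMETIS) are as in the SuperLU_DIST 4.x headers and may differ in other versions; the helper name is illustrative and this is not the PETSc interface source itself:

#include <superlu_ddefs.h>

/* What -mat_superlu_dist_parsymbfact toggles, per Hong's description:
   the default is serial symbolic factorization; raising the flag turns on
   ParSymbFact and forces the PARMETIS column ordering. */
static void set_symbfact_options(superlu_options_t *options, int parsymbfact)
{
  set_default_options_dist(options);
  options->ParSymbFact = NO;           /* PETSc interface default */
  if (parsymbfact) {
    options->ParSymbFact = YES;
    options->ColPerm     = PARMETIS;   /* required with parallel symbolic factorization */
  }
}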
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block 
of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 
0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > 
==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) 
> ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist 
(memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > 
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > 
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > From xzhao99 at gmail.com Fri Aug 7 11:21:40 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Fri, 7 Aug 2015 11:21:40 -0500 Subject: [petsc-users] SLEPc fails with POWER ITERATION method Message-ID: Hi all, I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell operation is set MATOP_MULT with user-defined function u = M*f. It works with the Krylov-Schur and Arnoldi method, but fails when I use Power Iteration method and several others. This is strange, because some of those are supposed to work with any type of problem. I also wrote a power iteration algorithm by myself, and it works well and obtains the same results with that from Krylov_Schur. I am curious why this doesn't work is SLEPc. 
The following are my code and error messages: ---------------------------------------------------------------------------------- ierr = MatSetFromOptions(M); CHKERRQ(ierr); ierr = MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Create the eigensolver and set various options - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); /* Select portion of spectrum */ if(option=="smallest") // LOBPCG for smallest eigenvalue problem! { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } else if(option=="largest") { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } else { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } // end if-else // Set the tolerance and maximum iteration ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = %d\n",tol,maxits);CHKERRQ(ierr); /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Solve the eigensystem and get the solution - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts ...\n");CHKERRQ(ierr); ierr = EPSSolve(eps);CHKERRQ(ierr); ---------------------------------------------------------------------------------- EPS tol = 0.000001, maxits = 100 EPS solve starts ... [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Wrong value of eps->which [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c [0]PETSC ERROR: #2 EPSSetUp() line 120 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c [0]PETSC ERROR: #3 EPSSolve() line 88 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Aug 7 11:53:11 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 7 Aug 2015 18:53:11 +0200 Subject: [petsc-users] SLEPc fails with POWER ITERATION method In-Reply-To: References: Message-ID: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> The power method only works with which=EPS_LARGEST_MAGNITUDE (or which=EPS_TARGET_MAGNITUDE if doing shift-and-invert). 
The rationale is that the power iteration converges to the dominant eigenvalue (the one with largest absolute value). Jose > El 7/8/2015, a las 18:21, Xujun Zhao escribi?: > > Hi all, > > I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell operation is set MATOP_MULT with user-defined function u = M*f. It works with the Krylov-Schur and Arnoldi method, but fails when I use Power Iteration method and several others. This is strange, because some of those are supposed to work with any type of problem. > > I also wrote a power iteration algorithm by myself, and it works well and obtains the same results with that from Krylov_Schur. I am curious why this doesn't work is SLEPc. The following are my code and error messages: > > ---------------------------------------------------------------------------------- > ierr = MatSetFromOptions(M); CHKERRQ(ierr); > ierr = MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); > printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > Create the eigensolver and set various options > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); > ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); > > > // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ > // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE > ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); > > > /* Select portion of spectrum */ > > if(option=="smallest") // LOBPCG for smallest eigenvalue problem! > { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } > else if(option=="largest") > { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } > else > { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } > // end if-else > > > // Set the tolerance and maximum iteration > ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = %d\n",tol,maxits);CHKERRQ(ierr); > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > Solve the eigensystem and get the solution > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts ...\n");CHKERRQ(ierr); > ierr = EPSSolve(eps);CHKERRQ(ierr); > > ---------------------------------------------------------------------------------- > > > EPS tol = 0.000001, maxits = 100 > > EPS solve starts ... > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Wrong value of eps->which > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 > > [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c > > [0]PETSC ERROR: #2 EPSSetUp() line 120 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c > > [0]PETSC ERROR: #3 EPSSolve() line 88 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c > > [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C > > > > > > > > > > > > > From xzhao99 at gmail.com Fri Aug 7 12:05:47 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Fri, 7 Aug 2015 12:05:47 -0500 Subject: [petsc-users] SLEPc fails with POWER ITERATION method In-Reply-To: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> References: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> Message-ID: Hi Jose, Thank you for your answer. The problem now is solved with setting EPS_LARGEST_MAGNITUDE. Xujun On Fri, Aug 7, 2015 at 11:53 AM, Jose E. Roman wrote: > The power method only works with which=EPS_LARGEST_MAGNITUDE (or > which=EPS_TARGET_MAGNITUDE if doing shift-and-invert). The rationale is > that the power iteration converges to the dominant eigenvalue (the one with > largest absolute value). > > Jose > > > > El 7/8/2015, a las 18:21, Xujun Zhao escribi?: > > > > Hi all, > > > > I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell > operation is set MATOP_MULT with user-defined function u = M*f. It works > with the Krylov-Schur and Arnoldi method, but fails when I use Power > Iteration method and several others. This is strange, because some of those > are supposed to work with any type of problem. > > > > I also wrote a power iteration algorithm by myself, and it works well > and obtains the same results with that from Krylov_Schur. I am curious why > this doesn't work is SLEPc. The following are my code and error messages: > > > > > ---------------------------------------------------------------------------------- > > ierr = MatSetFromOptions(M); CHKERRQ(ierr); > > ierr = > MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); > > > > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); > > printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); > > > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Create the eigensolver and set various options > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); > > ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); > > ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); > > > > > > // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ > > // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE > > ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); > > > > > > /* Select portion of spectrum */ > > > > if(option=="smallest") // LOBPCG for smallest eigenvalue problem! 
> > { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } > > else if(option=="largest") > > { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } > > else > > { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } > > // end if-else > > > > > > // Set the tolerance and maximum iteration > > ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = > %d\n",tol,maxits);CHKERRQ(ierr); > > > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Solve the eigensystem and get the solution > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts > ...\n");CHKERRQ(ierr); > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > ---------------------------------------------------------------------------------- > > > > > > EPS tol = 0.000001, maxits = 100 > > > > EPS solve starts ... > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Wrong value of eps->which > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > > > [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c > > > > [0]PETSC ERROR: #2 EPSSetUp() line 120 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c > > > > [0]PETSC ERROR: #3 EPSSolve() line 88 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c > > > > [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ustc.liu at gmail.com Sat Aug 8 07:52:19 2015 From: ustc.liu at gmail.com (sheng liu) Date: Sat, 8 Aug 2015 20:52:19 +0800 Subject: [petsc-users] Need to update matrix in every loop Message-ID: Hello: I have a large sparse symmetric matrix ( about 1000000x1000000), and I need about 10 eigenvalues near 0. The problem is: I need to run the same program about 1000 times, each time I need to change the diagonal matrix elements ( and they are generated randomly). Is there a fast way to implement this problem? Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 8 12:52:05 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 12:52:05 -0500 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: > On Aug 8, 2015, at 7:52 AM, sheng liu wrote: > > Hello: > I have a large sparse symmetric matrix ( about 1000000x1000000), and I need about 10 eigenvalues near 0. 
The problem is: I need to run the same program about 1000 times, each time I need to change the diagonal matrix elements ( and they are generated randomly). Is there a fast way to implement this problem? Thank you! Does each run depend on the previous one or are they all independent? If they are independent I would introduce two levels of parallelism: On the outer level have different MPI communicators compute different random diagonal perturbations and on the inner level use a small amount of parallelism for each eigenvalue solve. The outer level of parallelism is embarrassingly parallel. Of course, for runs of the eigensolve use -log_summary to make sure it is running efficiently and tune the amount of parallelism in the eigensolve for best performance. Barry From mc0710 at gmail.com Sat Aug 8 13:52:00 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 13:52:00 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering Message-ID: Hi, I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. I create a Petsc DMDA using DMDACreate2d(MPI_COMM_WORLD, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, DMDA_STENCIL_BOX, N1, N2, N1/NX1, N2/NX2, 1, nghost, PETSC_NULL, PETSC_NULL, &dmda); and then use this to create a vec: DMCreateGlobalVector(dmda, &vec); Now I copy the local contents of the application array to the petsc array using the following: Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: DMDAGetCorners(dmda, &iStart, &jStart, &kStart, &iSize, &jSize, &kSize ); double **arrayPetsc; DMDAVecGetArray(dmda, vec, &arrayPetsc); for (int j=0, jPetsc=jStart; j -------------- next part -------------- A non-text attachment was scrubbed... Name: 1_proc.png Type: image/png Size: 31474 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 4_proc.png Type: image/png Size: 33689 bytes Desc: not available URL: From knepley at gmail.com Sat Aug 8 14:19:55 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 14:19:55 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > Hi, > > I'm having trouble interfacing petsc to an application which I think is > related to the ordering of the nodes. Here's what I'm trying to do: > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. 
> > I create a Petsc DMDA using > > DMDACreate2d(MPI_COMM_WORLD, > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > DMDA_STENCIL_BOX, > N1, N2, > N1/NX1, N2/NX2, > 1, nghost, PETSC_NULL, PETSC_NULL, > &dmda); > > and then use this to create a vec: > > DMCreateGlobalVector(dmda, &vec); > > Now I copy the local contents of the application array to the petsc array > using the following: > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > &iSize, &jSize, &kSize > ); > > > double **arrayPetsc; > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > for (int j=0, jPetsc=jStart; j { > for (int i=0, iPetsc=iStart; i { > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > } > } > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > I should probably be using the AO object but its not clear how. Could you > help me out? > It looks like you have the global order of processes reversed, meaning you have 1 3 0 2 and it should be 2 3 0 1 Thanks, Matt > Thanks, > Mani > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Sat Aug 8 14:45:45 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 14:45:45 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: Thanks. Any suggestions for a fix? Reorder the indices in arrayApplication? On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > >> Hi, >> >> I'm having trouble interfacing petsc to an application which I think is >> related to the ordering of the nodes. Here's what I'm trying to do: >> >> The application uses a structured grid with a global array having >> dimensions N1 x N2, which is then decomposed into a local array with >> dimensions NX1 x NX2. >> >> I create a Petsc DMDA using >> >> DMDACreate2d(MPI_COMM_WORLD, >> DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >> DMDA_STENCIL_BOX, >> N1, N2, >> N1/NX1, N2/NX2, >> 1, nghost, PETSC_NULL, PETSC_NULL, >> &dmda); >> >> and then use this to create a vec: >> >> DMCreateGlobalVector(dmda, &vec); >> >> Now I copy the local contents of the application array to the petsc array >> using the following: >> >> Let i, j be the application indices and iPetsc and jPetsc be petsc's >> indices, then: >> >> DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >> &iSize, &jSize, &kSize >> ); >> >> >> double **arrayPetsc; >> DMDAVecGetArray(dmda, vec, &arrayPetsc); >> >> for (int j=0, jPetsc=jStart; j> { >> for (int i=0, iPetsc=iStart; i> { >> arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >> } >> } >> >> DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >> >> Now if I VecView(vec, viewer) and look at the data that petsc has, it >> looks right when run with 1 proc, but if I use 4 procs it's all messed up >> (see attached plots). >> >> I should probably be using the AO object but its not clear how. Could you >> help me out? 
>> > > It looks like you have the global order of processes reversed, meaning you > have > > 1 3 > > 0 2 > > and it should be > > 2 3 > > 0 1 > > Thanks, > > Matt > > >> Thanks, >> Mani >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Aug 8 14:48:43 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 14:48:43 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: On Sat, Aug 8, 2015 at 2:45 PM, Mani Chandra wrote: > Thanks. Any suggestions for a fix? > You have to deal with the right part of the domain in your application code. I have no idea how you are handling this, and its not in the code below. Matt > Reorder the indices in arrayApplication? > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > >> On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: >> >>> Hi, >>> >>> I'm having trouble interfacing petsc to an application which I think is >>> related to the ordering of the nodes. Here's what I'm trying to do: >>> >>> The application uses a structured grid with a global array having >>> dimensions N1 x N2, which is then decomposed into a local array with >>> dimensions NX1 x NX2. >>> >>> I create a Petsc DMDA using >>> >>> DMDACreate2d(MPI_COMM_WORLD, >>> DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >>> DMDA_STENCIL_BOX, >>> N1, N2, >>> N1/NX1, N2/NX2, >>> 1, nghost, PETSC_NULL, PETSC_NULL, >>> &dmda); >>> >>> and then use this to create a vec: >>> >>> DMCreateGlobalVector(dmda, &vec); >>> >>> Now I copy the local contents of the application array to the petsc >>> array using the following: >>> >>> Let i, j be the application indices and iPetsc and jPetsc be petsc's >>> indices, then: >>> >>> DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >>> &iSize, &jSize, &kSize >>> ); >>> >>> >>> double **arrayPetsc; >>> DMDAVecGetArray(dmda, vec, &arrayPetsc); >>> >>> for (int j=0, jPetsc=jStart; j>> { >>> for (int i=0, iPetsc=iStart; i>> { >>> arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >>> } >>> } >>> >>> DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >>> >>> Now if I VecView(vec, viewer) and look at the data that petsc has, it >>> looks right when run with 1 proc, but if I use 4 procs it's all messed up >>> (see attached plots). >>> >>> I should probably be using the AO object but its not clear how. Could >>> you help me out? >>> >> >> It looks like you have the global order of processes reversed, meaning >> you have >> >> 1 3 >> >> 0 2 >> >> and it should be >> >> 2 3 >> >> 0 1 >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Mani >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sat Aug 8 15:03:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 15:03:11 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > Thanks. Any suggestions for a fix? Just flip the meaning of the x indices and the y indices in the PETSc parts of the code? Also run with a very different N1 and N2 (instead of equal size) to better test the code coupling. Barry > > Reorder the indices in arrayApplication? > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > Hi, > > I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: > > The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. > > I create a Petsc DMDA using > > DMDACreate2d(MPI_COMM_WORLD, > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > DMDA_STENCIL_BOX, > N1, N2, > N1/NX1, N2/NX2, > 1, nghost, PETSC_NULL, PETSC_NULL, > &dmda); > > and then use this to create a vec: > > DMCreateGlobalVector(dmda, &vec); > > Now I copy the local contents of the application array to the petsc array using the following: > > Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > &iSize, &jSize, &kSize > ); > > > double **arrayPetsc; > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > for (int j=0, jPetsc=jStart; j { > for (int i=0, iPetsc=iStart; i { > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > } > } > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > Now if I VecView(vec, viewer) and look at the data that petsc has, it looks right when run with 1 proc, but if I use 4 procs it's all messed up (see attached plots). > > I should probably be using the AO object but its not clear how. Could you help me out? > > It looks like you have the global order of processes reversed, meaning you have > > 1 3 > > 0 2 > > and it should be > > 2 3 > > 0 1 > > Thanks, > > Matt > > Thanks, > Mani > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From mc0710 at gmail.com Sat Aug 8 15:08:24 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 15:08:24 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: Tried flipping the indices, I get a seg fault. On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > Thanks. Any suggestions for a fix? > > Just flip the meaning of the x indices and the y indices in the PETSc > parts of the code? > > Also run with a very different N1 and N2 (instead of equal size) to > better test the code coupling. > > Barry > > > > > > Reorder the indices in arrayApplication? > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > wrote: > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > Hi, > > > > I'm having trouble interfacing petsc to an application which I think is > related to the ordering of the nodes. 
Here's what I'm trying to do: > > > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. > > > > I create a Petsc DMDA using > > > > DMDACreate2d(MPI_COMM_WORLD, > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > DMDA_STENCIL_BOX, > > N1, N2, > > N1/NX1, N2/NX2, > > 1, nghost, PETSC_NULL, PETSC_NULL, > > &dmda); > > > > and then use this to create a vec: > > > > DMCreateGlobalVector(dmda, &vec); > > > > Now I copy the local contents of the application array to the petsc > array using the following: > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > &iSize, &jSize, &kSize > > ); > > > > > > double **arrayPetsc; > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > for (int j=0, jPetsc=jStart; j > { > > for (int i=0, iPetsc=iStart; i > { > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > } > > } > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > > > I should probably be using the AO object but its not clear how. Could > you help me out? > > > > It looks like you have the global order of processes reversed, meaning > you have > > > > 1 3 > > > > 0 2 > > > > and it should be > > > > 2 3 > > > > 0 1 > > > > Thanks, > > > > Matt > > > > Thanks, > > Mani > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 8 15:12:43 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 15:12:43 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: > > Tried flipping the indices, I get a seg fault. You would have to be careful in exactly what you flip. Note that the meaning of N1 and N2 etc would also be reversed between your code and the PETSc DMDA code. I would create a tiny DMDA and put entires like 1 2 3 4 ... into the array so you can track where the values go Barry > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > Thanks. Any suggestions for a fix? > > Just flip the meaning of the x indices and the y indices in the PETSc parts of the code? > > Also run with a very different N1 and N2 (instead of equal size) to better test the code coupling. > > Barry > > > > > > Reorder the indices in arrayApplication? > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > Hi, > > > > I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: > > > > The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. 
> > > > I create a Petsc DMDA using > > > > DMDACreate2d(MPI_COMM_WORLD, > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > DMDA_STENCIL_BOX, > > N1, N2, > > N1/NX1, N2/NX2, > > 1, nghost, PETSC_NULL, PETSC_NULL, > > &dmda); > > > > and then use this to create a vec: > > > > DMCreateGlobalVector(dmda, &vec); > > > > Now I copy the local contents of the application array to the petsc array using the following: > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > &iSize, &jSize, &kSize > > ); > > > > > > double **arrayPetsc; > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > for (int j=0, jPetsc=jStart; j > { > > for (int i=0, iPetsc=iStart; i > { > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > } > > } > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it looks right when run with 1 proc, but if I use 4 procs it's all messed up (see attached plots). > > > > I should probably be using the AO object but its not clear how. Could you help me out? > > > > It looks like you have the global order of processes reversed, meaning you have > > > > 1 3 > > > > 0 2 > > > > and it should be > > > > 2 3 > > > > 0 1 > > > > Thanks, > > > > Matt > > > > Thanks, > > Mani > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From mc0710 at gmail.com Sat Aug 8 16:56:02 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 16:56:02 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: So basically one needs to correctly map iPetsc, jPetsc -> iApplication, jApplication ? Is there is any standard way to do this? Can I get petsc to automatically follow the same parallel topology as the host application? Thanks, Mani On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: > > > > Tried flipping the indices, I get a seg fault. > > You would have to be careful in exactly what you flip. Note that the > meaning of N1 and N2 etc would also be reversed between your code and the > PETSc DMDA code. > > I would create a tiny DMDA and put entires like 1 2 3 4 ... into the > array so you can track where the values go > > Barry > > > > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > > > Thanks. Any suggestions for a fix? > > > > Just flip the meaning of the x indices and the y indices in the PETSc > parts of the code? > > > > Also run with a very different N1 and N2 (instead of equal size) to > better test the code coupling. > > > > Barry > > > > > > > > > > Reorder the indices in arrayApplication? > > > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > wrote: > > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > > Hi, > > > > > > I'm having trouble interfacing petsc to an application which I think > is related to the ordering of the nodes. Here's what I'm trying to do: > > > > > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. 
> > > > > > I create a Petsc DMDA using > > > > > > DMDACreate2d(MPI_COMM_WORLD, > > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > > DMDA_STENCIL_BOX, > > > N1, N2, > > > N1/NX1, N2/NX2, > > > 1, nghost, PETSC_NULL, PETSC_NULL, > > > &dmda); > > > > > > and then use this to create a vec: > > > > > > DMCreateGlobalVector(dmda, &vec); > > > > > > Now I copy the local contents of the application array to the petsc > array using the following: > > > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > > &iSize, &jSize, &kSize > > > ); > > > > > > > > > double **arrayPetsc; > > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > > > for (int j=0, jPetsc=jStart; j > > { > > > for (int i=0, iPetsc=iStart; i iPetsc++) > > > { > > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > > } > > > } > > > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > > > > > I should probably be using the AO object but its not clear how. Could > you help me out? > > > > > > It looks like you have the global order of processes reversed, meaning > you have > > > > > > 1 3 > > > > > > 0 2 > > > > > > and it should be > > > > > > 2 3 > > > > > > 0 1 > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > > Mani > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Aug 8 16:58:49 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 16:58:49 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra wrote: > So basically one needs to correctly map > > iPetsc, jPetsc -> iApplication, jApplication ? > > Is there is any standard way to do this? Can I get petsc to automatically > follow the same parallel topology as the host application? > If you want to use DMDA, there is only one mapping of ranks, namely lexicographic. However, every structured grid code I have ever seen uses that mapping, perhaps with a permutation of the directions {x, y, z}. Thus, the user needs to map the directions in PETSc in the right order for the application. I am not sure how you would automate this seeing as it depends on the application. Thanks, Matt > Thanks, > Mani > > On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: > >> >> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: >> > >> > Tried flipping the indices, I get a seg fault. >> >> You would have to be careful in exactly what you flip. Note that the >> meaning of N1 and N2 etc would also be reversed between your code and the >> PETSc DMDA code. >> >> I would create a tiny DMDA and put entires like 1 2 3 4 ... into the >> array so you can track where the values go >> >> Barry >> >> > >> > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: >> > >> > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: >> > > >> > > Thanks. Any suggestions for a fix? >> > >> > Just flip the meaning of the x indices and the y indices in the PETSc >> parts of the code? 
>> > >> > Also run with a very different N1 and N2 (instead of equal size) to >> better test the code coupling. >> > >> > Barry >> > >> > >> > > >> > > Reorder the indices in arrayApplication? >> > > >> > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley >> wrote: >> > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra >> wrote: >> > > Hi, >> > > >> > > I'm having trouble interfacing petsc to an application which I think >> is related to the ordering of the nodes. Here's what I'm trying to do: >> > > >> > > The application uses a structured grid with a global array having >> dimensions N1 x N2, which is then decomposed into a local array with >> dimensions NX1 x NX2. >> > > >> > > I create a Petsc DMDA using >> > > >> > > DMDACreate2d(MPI_COMM_WORLD, >> > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >> > > DMDA_STENCIL_BOX, >> > > N1, N2, >> > > N1/NX1, N2/NX2, >> > > 1, nghost, PETSC_NULL, PETSC_NULL, >> > > &dmda); >> > > >> > > and then use this to create a vec: >> > > >> > > DMCreateGlobalVector(dmda, &vec); >> > > >> > > Now I copy the local contents of the application array to the petsc >> array using the following: >> > > >> > > Let i, j be the application indices and iPetsc and jPetsc be petsc's >> indices, then: >> > > >> > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >> > > &iSize, &jSize, &kSize >> > > ); >> > > >> > > >> > > double **arrayPetsc; >> > > DMDAVecGetArray(dmda, vec, &arrayPetsc); >> > > >> > > for (int j=0, jPetsc=jStart; j> jPetsc++) >> > > { >> > > for (int i=0, iPetsc=iStart; i> iPetsc++) >> > > { >> > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >> > > } >> > > } >> > > >> > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >> > > >> > > Now if I VecView(vec, viewer) and look at the data that petsc has, it >> looks right when run with 1 proc, but if I use 4 procs it's all messed up >> (see attached plots). >> > > >> > > I should probably be using the AO object but its not clear how. Could >> you help me out? >> > > >> > > It looks like you have the global order of processes reversed, >> meaning you have >> > > >> > > 1 3 >> > > >> > > 0 2 >> > > >> > > and it should be >> > > >> > > 2 3 >> > > >> > > 0 1 >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > Thanks, >> > > Mani >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjraskin at mynicejob.com Sat Aug 8 20:38:31 2015 From: jjraskin at mynicejob.com (Jeffery Raskin) Date: Sat, 8 Aug 2015 18:38:31 -0700 Subject: [petsc-users] =?windows-1252?q?Many_Nurse_/_medical_positions_=28?= =?windows-1252?q?Travel=29?= Message-ID: <162612d39c2b682dbab9a507002f99e3@mynicejob.com> ?Hi , Was curious if you or someone you may know was looking for a change.? We have the following position.? Also we have others, in various cities. PICU NURSE - TRAVEL http://www.mynicejob.com/jobdescription.cfm?jobID=14478 ICU (RN) - http://www.mynicejob.com/jobdescription.cfm?jobID=14481 Med/Surg Telemetry (RN) http://www.mynicejob.com/jobdescription.cfm?jobID=14480 Also 400 travel nurse positions. med surg - $75? / ICU $90.? Housing is paid for. 
Contact Jeffery Raskin - jjraskin at mynicejob.com? Title: JobID: Location: Company Info: *NURSING DEPT. MANAGER 14479 VALLEJO, California MyNiceJob *PICU NURSE - TRAVEL 14478 BALDWIN PARK, California MyNiceJob *LABOR & DELIVERY (RN) - PER DIEM 14477 IRVINE, California MyNiceJob *CATH LAB NURSE - PER DIEM 14476 SACRAMENTO, California MyNiceJob *RN-Emergency Department 14463 Uvalde, Travel Position MyNiceJob *Director, Analytics and Insights 14462 indianapolis, Indiana MyNiceJob *Manager of Outpatient Clinics (Nursing) 14460 King City , California MyNiceJob *Pharmacist 14459 King City, California MyNiceJob *Registered Nurse-ICU 14458 King City, California MyNiceJob *Registered Nurse - OB 14457 King City, California MyNiceJob *Physician Assistant 14456 King City, California MyNiceJob *Physical Therapist - FT/PT 14455 Pullman, Washington MyNiceJob *Manager Finance 14454 Baltimore, Maryland MyNiceJob *Director of MSU/ICU - FT 14453 Pullman, Washington MyNiceJob *Utilization Manager - Afterhours Program 14452 Remote , Louisiana MyNiceJob *Director, Service Coordination - RN/LVN 14451 Dallas, Texas MyNiceJob *Registered Nurse/ Licensed Practical Nurse 14450 silver city, New Mexico MyNiceJob *Manager of Surgery 14449 King City, California MyNiceJob *Director of Skilled Nursing Facility 14448 King City, California MyNiceJob *Infection Preventionist 14447 King City, California MyNiceJob *Director Community Services 14446 Palmdale, California MyNiceJob *Physician Practice Director- Uvalde Memorial Hosp 14445 San Antonio, Texas MyNiceJob *DEVELOPMENT MANAGER 14444 Los Angeles, California MyNiceJob *Outpatient Nurse Care Coordinator 14442 King City, California MyNiceJob *Director Quality Improvement 14441 Sunrise, Florida MyNiceJob *Family Nurse Practitioner 14440 King City, California MyNiceJob *Registered Nurse - Surgery 14439 King City, California MyNiceJob To stop receiving emails from our company; please reply to this email with the word remove in the subject line. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fabian.Jakub at physik.uni-muenchen.de Sun Aug 9 14:31:49 2015 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian) Date: Sun, 09 Aug 2015 21:31:49 +0200 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: <55C7AAA5.3090708@physik.uni-muenchen.de> If the problem is due to the rank-ordering, the following excerpt from the PETSc FAQ section may help: The PETSc DA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together? The MPI_Cart_create() first divides the mesh along the z direction, then the y, then the x. DMDA divides along the x, then y, then z. Thus, for example, rank 1 of the processes will be in a different part of the mesh for the two schemes. To resolve this you can create a new MPI communicator that you pass to DMDACreate() that renumbers the process ranks so that each physical process shares the same part of the mesh with both the DMDA and the MPI_Cart_create(). The code to determine the new numbering was provided by Rolf Kuiper. 
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively // (no parallelization in direction 'dir' means dir_procs = 1) MPI_Comm NewComm; int MPI_Rank, NewRank, x,y,z; // get rank from MPI ordering: MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); // calculate coordinates of cpus in MPI ordering: x = MPI_rank / (z_procs*y_procs); y = (MPI_rank % (z_procs*y_procs)) / z_procs; z = (MPI_rank % (z_procs*y_procs)) % z_procs; // set new rank according to PETSc ordering: NewRank = z*y_procs*x_procs + y*x_procs + x; // create communicator with new ranks according to PETSc ordering: MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); // override the default communicator (was MPI_COMM_WORLD as default) PETSC_COMM_WORLD = NewComm; On 08.08.2015 23:58, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra > wrote: > > So basically one needs to correctly map > > iPetsc, jPetsc -> iApplication, jApplication ? > > Is there is any standard way to do this? Can I get petsc to > automatically follow the same parallel topology as the host > application? > > > If you want to use DMDA, there is only one mapping of ranks, namely > lexicographic. However, every structured grid code I have > ever seen uses that mapping, perhaps with a permutation of the > directions {x, y, z}. Thus, the user needs to map the directions > in PETSc in the right order for the application. I am not sure how you > would automate this seeing as it depends on the application. > > Thanks, > > Matt > > Thanks, > Mani > > On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith > wrote: > > > > On Aug 8, 2015, at 3:08 PM, Mani Chandra > wrote: > > > > Tried flipping the indices, I get a seg fault. > > You would have to be careful in exactly what you flip. Note > that the meaning of N1 and N2 etc would also be reversed > between your code and the PETSc DMDA code. > > I would create a tiny DMDA and put entires like 1 2 3 4 ... > into the array so you can track where the values go > > Barry > > > > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith > > wrote: > > > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra > wrote: > > > > > > Thanks. Any suggestions for a fix? > > > > Just flip the meaning of the x indices and the y indices > in the PETSc parts of the code? > > > > Also run with a very different N1 and N2 (instead of > equal size) to better test the code coupling. > > > > Barry > > > > > > > > > > Reorder the indices in arrayApplication? > > > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > > wrote: > > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra > > wrote: > > > Hi, > > > > > > I'm having trouble interfacing petsc to an application > which I think is related to the ordering of the nodes. Here's > what I'm trying to do: > > > > > > The application uses a structured grid with a global array > having dimensions N1 x N2, which is then decomposed into a > local array with dimensions NX1 x NX2. 
> > > > > > I create a Petsc DMDA using > > > > > > DMDACreate2d(MPI_COMM_WORLD, > > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > > DMDA_STENCIL_BOX, > > > N1, N2, > > > N1/NX1, N2/NX2, > > > 1, nghost, PETSC_NULL, PETSC_NULL, > > > &dmda); > > > > > > and then use this to create a vec: > > > > > > DMCreateGlobalVector(dmda, &vec); > > > > > > Now I copy the local contents of the application array to > the petsc array using the following: > > > > > > Let i, j be the application indices and iPetsc and jPetsc > be petsc's indices, then: > > > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > > &iSize, &jSize, &kSize > > > ); > > > > > > > > > double **arrayPetsc; > > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > > > for (int j=0, jPetsc=jStart; j j++, jPetsc++) > > > { > > > for (int i=0, iPetsc=iStart; i i++, iPetsc++) > > > { > > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > > } > > > } > > > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > > > Now if I VecView(vec, viewer) and look at the data that > petsc has, it looks right when run with 1 proc, but if I use 4 > procs it's all messed up (see attached plots). > > > > > > I should probably be using the AO object but its not clear > how. Could you help me out? > > > > > > It looks like you have the global order of processes > reversed, meaning you have > > > > > > 1 3 > > > > > > 0 2 > > > > > > and it should be > > > > > > 2 3 > > > > > > 0 1 > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > > Mani > > > -- > > > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Sun Aug 9 16:57:15 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sun, 9 Aug 2015 16:57:15 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: <55C7AAA5.3090708@physik.uni-muenchen.de> References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> <55C7AAA5.3090708@physik.uni-muenchen.de> Message-ID: Thank you! This was *exactly* what I was looking it. It fixed the problem. On Sun, Aug 9, 2015 at 2:31 PM, Fabian wrote: > If the problem is due to the rank-ordering, the following excerpt from the > PETSc FAQ section may help: > > > > > The PETSc DA object decomposes the domain differently than the > MPI_Cart_create() command. How can one use them together? > > The MPI_Cart_create() first divides the mesh along the z direction, then > the y, then the x. DMDA divides along the x, then y, then z. Thus, for > example, rank 1 of the processes will be in a different part of the mesh > for the two schemes. To resolve this you can create a new MPI communicator > that you pass to DMDACreate() that renumbers the process ranks so that each > physical process shares the same part of the mesh with both the DMDA and > the MPI_Cart_create(). The code to determine the new numbering was provided > by Rolf Kuiper. 
> > // the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively > // (no parallelization in direction 'dir' means dir_procs = 1) > > MPI_Comm NewComm; > int MPI_Rank, NewRank, x,y,z; > > // get rank from MPI ordering: > MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); > > // calculate coordinates of cpus in MPI ordering: > x = MPI_rank / (z_procs*y_procs); > y = (MPI_rank % (z_procs*y_procs)) / z_procs; > z = (MPI_rank % (z_procs*y_procs)) % z_procs; > > // set new rank according to PETSc ordering: > NewRank = z*y_procs*x_procs + y*x_procs + x; > > // create communicator with new ranks according to > PETSc ordering: > MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); > > // override the default communicator (was > MPI_COMM_WORLD as default) > PETSC_COMM_WORLD = NewComm; > > > > > On 08.08.2015 23:58, Matthew Knepley wrote: > > On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra wrote: > >> So basically one needs to correctly map >> >> iPetsc, jPetsc -> iApplication, jApplication ? >> >> Is there is any standard way to do this? Can I get petsc to automatically >> follow the same parallel topology as the host application? >> > > If you want to use DMDA, there is only one mapping of ranks, namely > lexicographic. However, every structured grid code I have > ever seen uses that mapping, perhaps with a permutation of the directions > {x, y, z}. Thus, the user needs to map the directions > in PETSc in the right order for the application. I am not sure how you > would automate this seeing as it depends on the application. > > Thanks, > > Matt > > >> Thanks, >> Mani >> >> On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: >> >>> >>> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: >>> > >>> > Tried flipping the indices, I get a seg fault. >>> >>> You would have to be careful in exactly what you flip. Note that the >>> meaning of N1 and N2 etc would also be reversed between your code and the >>> PETSc DMDA code. >>> >>> I would create a tiny DMDA and put entires like 1 2 3 4 ... into the >>> array so you can track where the values go >>> >>> Barry >>> >>> > >>> > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith >>> wrote: >>> > >>> > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: >>> > > >>> > > Thanks. Any suggestions for a fix? >>> > >>> > Just flip the meaning of the x indices and the y indices in the >>> PETSc parts of the code? >>> > >>> > Also run with a very different N1 and N2 (instead of equal size) to >>> better test the code coupling. >>> > >>> > Barry >>> > >>> > >>> > > >>> > > Reorder the indices in arrayApplication? >>> > > >>> > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley >>> wrote: >>> > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra >>> wrote: >>> > > Hi, >>> > > >>> > > I'm having trouble interfacing petsc to an application which I think >>> is related to the ordering of the nodes. Here's what I'm trying to do: >>> > > >>> > > The application uses a structured grid with a global array having >>> dimensions N1 x N2, which is then decomposed into a local array with >>> dimensions NX1 x NX2. 
>>> > > >>> > > I create a Petsc DMDA using >>> > > >>> > > DMDACreate2d(MPI_COMM_WORLD, >>> > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >>> > > DMDA_STENCIL_BOX, >>> > > N1, N2, >>> > > N1/NX1, N2/NX2, >>> > > 1, nghost, PETSC_NULL, PETSC_NULL, >>> > > &dmda); >>> > > >>> > > and then use this to create a vec: >>> > > >>> > > DMCreateGlobalVector(dmda, &vec); >>> > > >>> > > Now I copy the local contents of the application array to the petsc >>> array using the following: >>> > > >>> > > Let i, j be the application indices and iPetsc and jPetsc be petsc's >>> indices, then: >>> > > >>> > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >>> > > &iSize, &jSize, &kSize >>> > > ); >>> > > >>> > > >>> > > double **arrayPetsc; >>> > > DMDAVecGetArray(dmda, vec, &arrayPetsc); >>> > > >>> > > for (int j=0, jPetsc=jStart; j>> jPetsc++) >>> > > { >>> > > for (int i=0, iPetsc=iStart; i>> iPetsc++) >>> > > { >>> > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >>> > > } >>> > > } >>> > > >>> > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >>> > > >>> > > Now if I VecView(vec, viewer) and look at the data that petsc has, >>> it looks right when run with 1 proc, but if I use 4 procs it's all messed >>> up (see attached plots). >>> > > >>> > > I should probably be using the AO object but its not clear how. >>> Could you help me out? >>> > > >>> > > It looks like you have the global order of processes reversed, >>> meaning you have >>> > > >>> > > 1 3 >>> > > >>> > > 0 2 >>> > > >>> > > and it should be >>> > > >>> > > 2 3 >>> > > >>> > > 0 1 >>> > > >>> > > Thanks, >>> > > >>> > > Matt >>> > > >>> > > Thanks, >>> > > Mani >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Mon Aug 10 03:57:52 2015 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 10 Aug 2015 10:57:52 +0200 Subject: [petsc-users] set the diagonal for zero row Message-ID: <55C86790.9040401@gmail.com> Dear list What is the best way to search for complete zero rows of the matrix and set the diagonal to 1.0? In my thinking, the solution would be: + extract the max absolute values of each row by using MatGetRowMaxAbs + Compare the value with some tolerance and put into the zero row list + Extract the diagonal of the matrix by MatGetDiagonal + modify the vector of diagonal + set the diagonal back by using MatDiagonalSet THis seem to be overly complicated. Is there an all-in-one solution, similar to MatZeroRowsColumns? Best regards Giang From knepley at gmail.com Mon Aug 10 07:02:10 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 07:02:10 -0500 Subject: [petsc-users] set the diagonal for zero row In-Reply-To: <55C86790.9040401@gmail.com> References: <55C86790.9040401@gmail.com> Message-ID: On Mon, Aug 10, 2015 at 3:57 AM, Hoang Giang Bui wrote: > Dear list > > What is the best way to search for complete zero rows of the matrix and > set the diagonal to 1.0? 
In my thinking, the solution would be: > + extract the max absolute values of each row by using MatGetRowMaxAbs > + Compare the value with some tolerance and put into the zero row list > + Extract the diagonal of the matrix by MatGetDiagonal > + modify the vector of diagonal > + set the diagonal back by using MatDiagonalSet > > THis seem to be overly complicated. Is there an all-in-one solution, > similar to MatZeroRowsColumns? > I don't think there is anything better than checking the MaxAbs, and calling MatZeroRows() on the indices with a 0.0 Thanks, Matt > Best regards > Giang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 10 09:56:59 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 10 Aug 2015 09:56:59 -0500 Subject: [petsc-users] Augmented Lagrangian examples? Message-ID: Hi all, 1) I ran across this paper: http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf and was wondering if there are any current TAO examples that do this. also 2) If I integrate this into an FEM (from DMPlex) I will need to assemble an equality jacobian matrix and constraint vector. But the element-wise constraints that I need to compute (e.g., the divergence) needs all degrees of freedom within the element closure including the essential BCs DMPlex removes from the global matrix/vector. So how can I work around this and/or access said removed terms inside a TAO routine? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 10 10:01:07 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 10:01:07 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang wrote: > Hi all, > > 1) I ran across this paper: > > http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf > > and was wondering if there are any current TAO examples that do this. > > also > > 2) If I integrate this into an FEM (from DMPlex) I will need to assemble > an equality jacobian matrix and constraint vector. But the element-wise > constraints that I need to compute (e.g., the divergence) needs all degrees > of freedom within the element closure including the essential BCs DMPlex > removes from the global matrix/vector. So how can I work around this and/or > access said removed terms inside a TAO routine? > Those are not really dofs, they are boundary values, so they are not in the global system. The local vectors have the boundary values, so you can calculate the correct constraint eqns to put in the global system. Matt > Thanks, > Justin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 10 11:05:39 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 10 Aug 2015 11:05:39 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: Matt, So inside these TAO routines, if I wanted to include the boundary values, would I follow the approaches in functions like DMPlexComputeResidual/Jacobian_Internal? 
I assume I need something like: DMGetLocalVector(dm,xlocal); DMPlexInsertBoundaryValues(xlocal,...); ** use xlocal to compute equality constraints/jacobian ** DMRestoreLocalVector(dm,xlocal); The Jacobian and equality constraints that I want to assemble are not the same as the DM matrix use for the entire problem. I am guessing I will need to use a different DS for the DM because, for example the stokes problem with TH elements, I want to assemble an equality jacobian of size (no. of cells) by (no. of velocity dofs), and an equality constraints vector of size (no. of cells). How would I go about doing a problem like this? Thanks, Justin On Mon, Aug 10, 2015 at 10:01 AM, Matthew Knepley wrote: > On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang wrote: > >> Hi all, >> >> 1) I ran across this paper: >> >> http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf >> >> and was wondering if there are any current TAO examples that do this. >> >> also >> >> 2) If I integrate this into an FEM (from DMPlex) I will need to assemble >> an equality jacobian matrix and constraint vector. But the element-wise >> constraints that I need to compute (e.g., the divergence) needs all degrees >> of freedom within the element closure including the essential BCs DMPlex >> removes from the global matrix/vector. So how can I work around this and/or >> access said removed terms inside a TAO routine? >> > > Those are not really dofs, they are boundary values, so they are not in > the global system. The local vectors have > the boundary values, so you can calculate the correct constraint eqns to > put in the global system. > > Matt > > >> Thanks, >> Justin >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 10 14:26:15 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 14:26:15 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: On Mon, Aug 10, 2015 at 11:05 AM, Justin Chang wrote: > Matt, > > So inside these TAO routines, if I wanted to include the boundary values, > would I follow the approaches in functions like > DMPlexComputeResidual/Jacobian_Internal? I assume I need something like: > > DMGetLocalVector(dm,xlocal); > DMPlexInsertBoundaryValues(xlocal,...); > ** use xlocal to compute equality constraints/jacobian ** > DMRestoreLocalVector(dm,xlocal); > Yes. > The Jacobian and equality constraints that I want to assemble are not the > same as the DM matrix use for the entire problem. I am guessing I will need > to use a different DS for the DM because, for example the stokes problem > with TH elements, I want to assemble an equality jacobian of size (no. of > cells) by (no. of velocity dofs), and an equality constraints vector of > size (no. of cells). How would I go about doing a problem like this? > Okay, now we get into some choices I made in Plex. The original version I wrote could assemble rectangular matrices. This was a huge complication that no one took advantage of, so I got rid of it. Now I just assemble the entire problem, and if you want pieces of it, I pull them out using MatGetSubmatrix(). I still believe this is the cleanest thing to do. Maybe you could schematically tell me what you want to do. 
Thanks, Matt > Thanks, > Justin > > > On Mon, Aug 10, 2015 at 10:01 AM, Matthew Knepley > wrote: > >> On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> 1) I ran across this paper: >>> >>> http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf >>> >>> and was wondering if there are any current TAO examples that do this. >>> >>> also >>> >>> 2) If I integrate this into an FEM (from DMPlex) I will need to assemble >>> an equality jacobian matrix and constraint vector. But the element-wise >>> constraints that I need to compute (e.g., the divergence) needs all degrees >>> of freedom within the element closure including the essential BCs DMPlex >>> removes from the global matrix/vector. So how can I work around this and/or >>> access said removed terms inside a TAO routine? >>> >> >> Those are not really dofs, they are boundary values, so they are not in >> the global system. The local vectors have >> the boundary values, so you can calculate the correct constraint eqns to >> put in the global system. >> >> Matt >> >> >>> Thanks, >>> Justin >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Mon Aug 10 15:50:50 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Mon, 10 Aug 2015 13:50:50 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist Message-ID: <55C90EAA.5060702@email.arizona.edu> Hi Sherry, I recently submitted a matrix for which I noticed that Superlu_dist was hanging when running on 4 processors with parallel symbolic factorization. I have been using the latest version of Superlu_dist and the code is not hanging anymore. However, I noticed that when running the same matrix (I have attached the matrix), the code crashes with the following SIGSEGV when running on 10 procs (with or without parallel symbolic factorization). It is probably overkill to run such a 'small' matrix on 10 procs but I thought that it might still be useful to report the problem?? See below for the error obtained when running with gdb and also a code snippet to reproduce the error. Thanks, Anthony 1) ERROR in GDB Program received signal SIGSEGV, Segmentation fault. 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, A=0x14a6a70, info=0x19099f8) at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ (gdb) 2) PORTION OF CODE TO REPRODUCE ERROR Subroutine HowBigLUCanBe(rank) IMPLICIT NONE integer(i4b),intent(in) :: rank integer(i4b) :: i,ct real(dp) :: begin,endd complex(dpc) :: sigma PetscErrorCode ierr if (rank==0) call cpu_time(begin) if (rank==0) then write(*,*) write(*,*)'Testing How Big LU Can Be...' write(*,*)'============================' write(*,*) endif !sigma = (1.0d0,0.0d0) !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! 
on exit A = A-sigma*B !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) !.....Write Matrix to ASCII and Binary Format !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) !call MatView(DXX,viewer,ierr) !call PetscViewerDestroy(viewer,ierr) !call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) !call MatView(A,viewer,ierr) !call PetscViewerDestroy(viewer,ierr) !...Load a Matrix in Binary Format call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) call MatSetType(DLOAD,MATAIJ,ierr) call MatLoad(DLOAD,viewer,ierr) call PetscViewerDestroy(viewer,ierr) !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) !.....Create Linear Solver Context call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) !.....Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix. !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = A-sigma*B !.....Set Relative and Absolute Tolerances and Uses Default for Divergence Tol tol = 1.e-10 call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) !.....Set the Direct (LU) Solver call KSPSetType(ksp,KSPPREONLY,ierr) call KSPGetPC(ksp,pc,ierr) call PCSetType(pc,PCLU,ierr) call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS !.....Create Right-Hand-Side Vector !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) allocate(xwork1(IendA-IstartA)) allocate(loc(IendA-IstartA)) ct=0 do i=IstartA,IendA-1 ct=ct+1 loc(ct)=i xwork1(ct)=(1.0d0,0.0d0) enddo call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) call VecZeroEntries(sol,ierr) deallocate(xwork1,loc) !.....Assemble Vectors call VecAssemblyBegin(frhs,ierr) call VecAssemblyEnd(frhs,ierr) !.....Solve the Linear System call KSPSolve(ksp,frhs,sol,ierr) !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) if (rank==0) then call cpu_time(endd) write(*,*) print '("Total time for HowBigLUCanBe = ",f21.3," seconds.")',endd-begin endif call SlepcFinalize(ierr) STOP end Subroutine HowBigLUCanBe -------------- next part -------------- A non-text attachment was scrubbed... Name: Amat_binary.m Type: text/x-objcsrc Size: 7906356 bytes Desc: not available URL: -------------- next part -------------- -matload_block_size 1 From bsmith at mcs.anl.gov Mon Aug 10 16:27:00 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 10 Aug 2015 16:27:00 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55C90EAA.5060702@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> Message-ID: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Anthony, This crash is in PETSc code before it calls the SuperLU_DIST numeric factorization; likely we have a mistake such as assuming a process has at least one row of the matrix and need to fix it. 
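A tiny illustration of the failure mode being pointed at here (this is not the actual superlu_dist.c source; the names are hypothetical): on a rank that owns zero rows of A, the local aij arrays are empty, so an expression like rstart + ajj[0] reads memory that was never set. Guarding on the local row count is the kind of fix implied:

#include <petscsys.h>

/* Sketch only: return a first-column value without touching ajj when the
   local block is empty.  nlocalrows = rend - rstart from MatGetOwnershipRange(). */
static PetscInt SafeFirstColumn(PetscInt rstart, PetscInt nlocalrows, const PetscInt *ajj)
{
  if (nlocalrows > 0) return rstart + ajj[0]; /* at least one local row: ajj[0] exists */
  return rstart;                              /* empty local block: nothing to dereference */
}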
Barry > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > A=0x14a6a70, info=0x19099f8) > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > Hi Sherry, > > I recently submitted a matrix for which I noticed that Superlu_dist was hanging when running on 4 processors with parallel symbolic factorization. I have been using the latest version of Superlu_dist and the code is not hanging anymore. However, I noticed that when running the same matrix (I have attached the matrix), the code crashes with the following SIGSEGV when running on 10 procs (with or without parallel symbolic factorization). It is probably overkill to run such a 'small' matrix on 10 procs but I thought that it might still be useful to report the problem?? See below for the error obtained when running with gdb and also a code snippet to reproduce the error. > > Thanks, > > > Anthony > > > > 1) ERROR in GDB > > Program received signal SIGSEGV, Segmentation fault. > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > A=0x14a6a70, info=0x19099f8) > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ > (gdb) > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > Subroutine HowBigLUCanBe(rank) > > IMPLICIT NONE > > integer(i4b),intent(in) :: rank > integer(i4b) :: i,ct > real(dp) :: begin,endd > complex(dpc) :: sigma > > PetscErrorCode ierr > > > if (rank==0) call cpu_time(begin) > > if (rank==0) then > write(*,*) > write(*,*)'Testing How Big LU Can Be...' > write(*,*)'============================' > write(*,*) > endif > > !sigma = (1.0d0,0.0d0) > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit A = A-sigma*B > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !.....Write Matrix to ASCII and Binary Format > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > !call MatView(DXX,viewer,ierr) > !call PetscViewerDestroy(viewer,ierr) > > !call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > !call MatView(A,viewer,ierr) > !call PetscViewerDestroy(viewer,ierr) > > !...Load a Matrix in Binary Format > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > call MatSetType(DLOAD,MATAIJ,ierr) > call MatLoad(DLOAD,viewer,ierr) > call PetscViewerDestroy(viewer,ierr) > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > !.....Create Linear Solver Context > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > !.....Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix. > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = A-sigma*B > > !.....Set Relative and Absolute Tolerances and Uses Default for Divergence Tol > tol = 1.e-10 > call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > !.....Set the Direct (LU) Solver > call KSPSetType(ksp,KSPPREONLY,ierr) > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc,PCLU,ierr) > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > !.....Create Right-Hand-Side Vector > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > allocate(xwork1(IendA-IstartA)) > allocate(loc(IendA-IstartA)) > > ct=0 > do i=IstartA,IendA-1 > ct=ct+1 > loc(ct)=i > xwork1(ct)=(1.0d0,0.0d0) > enddo > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > call VecZeroEntries(sol,ierr) > > deallocate(xwork1,loc) > > !.....Assemble Vectors > call VecAssemblyBegin(frhs,ierr) > call VecAssemblyEnd(frhs,ierr) > > !.....Solve the Linear System > call KSPSolve(ksp,frhs,sol,ierr) > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > if (rank==0) then > call cpu_time(endd) > write(*,*) > print '("Total time for HowBigLUCanBe = ",f21.3," seconds.")',endd-begin > endif > > call SlepcFinalize(ierr) > > STOP > > > end Subroutine HowBigLUCanBe > > From hzhang at mcs.anl.gov Mon Aug 10 16:58:19 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 10 Aug 2015 16:58:19 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: I'll fix this in the release if no one has done it yet. Hong On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith wrote: > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST numeric > factorization; likely we have a mistake such as assuming a process has at > least one row of the matrix and need to fix it. > > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that Superlu_dist was > hanging when running on 4 processors with parallel symbolic factorization. > I have been using the latest version of Superlu_dist and the code is not > hanging anymore. However, I noticed that when running the same matrix (I > have attached the matrix), the code crashes with the following SIGSEGV when > running on 10 procs (with or without parallel symbolic factorization). It > is probably overkill to run such a 'small' matrix on 10 procs but I thought > that it might still be useful to report the problem?? See below for the > error obtained when running with gdb and also a code snippet to reproduce > the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. 
> > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit > A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear system also > serves as the preconditioning matrix. > > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = > A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Tue Aug 11 09:31:59 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Tue, 11 Aug 2015 14:31:59 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: Yes! Doing: $PETSC_DIR/$PETSC_ARCH/bin/mpiexec instead of mpiexec makes the program run as expected. Thank you all for your patience and encouragement. Sherry: I have noticed that you have been involved in some publications related to my current work, i.e. wave propagation in elastic solids. What computation time would you expect using SuperLU to solve one linear system with say 800000 degrees of freedom and 4-8 processes (on a single node) with a finite element discretization? Mahir -----Original Message----- From: Satish Balay [mailto:balay at mcs.anl.gov] Sent: den 7 augusti 2015 18:09 To: ?lker-Kaustell, Mahir Cc: Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? 
> > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... 
> > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? > KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
> > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. 
If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li > > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se > > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >, "Xiaoye S. Li" > > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. 
Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. 
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block 
of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 
0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > 
==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) 
> ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist 
(memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > 
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > 
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) 
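For reference, a minimal sketch (not taken from this thread) of the PETSc call sequence that routes a direct solve through SuperLU_DIST so that the -mat_superlu_dist_* runtime options discussed here are picked up. The 1-D Laplacian below is only a stand-in for the application matrix, and the calls follow the PETSc 3.5/3.6-era C API referenced elsewhere in the thread:

/* Hedged sketch: selects SuperLU_DIST as the LU back end; the matrix is a
 * placeholder so the program is self-contained. */
#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            b,x;
  KSP            ksp;
  PC             pc;
  PetscInt       i,n = 100,Istart,Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);

  /* Assemble a simple 1-D Laplacian just to have something to factor */
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    if (i>0)   {ierr = MatSetValue(A,i,i-1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
    if (i<n-1) {ierr = MatSetValue(A,i,i+1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A,i,i,2.0,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);

  /* Direct solve through SuperLU_DIST */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* honors -mat_superlu_dist_* options */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Running it as, for example, $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 4 ./sketch -mat_superlu_dist_parsymbfact -ksp_view should report the SuperLU_DIST run parameters that were actually used.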
> > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > From hzhang at mcs.anl.gov Tue Aug 11 11:58:24 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 11 Aug 2015 11:58:24 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: Anthony, I pushed a fix https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf Once it passes our nightly tests, I'll merge it to petsc-maint, then petsc-dev. Thanks for reporting it! Hong On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith wrote: > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST numeric > factorization; likely we have a mistake such as assuming a process has at > least one row of the matrix and need to fix it. 
> > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that Superlu_dist was > hanging when running on 4 processors with parallel symbolic factorization. > I have been using the latest version of Superlu_dist and the code is not > hanging anymore. However, I noticed that when running the same matrix (I > have attached the matrix), the code crashes with the following SIGSEGV when > running on 10 procs (with or without parallel symbolic factorization). It > is probably overkill to run such a 'small' matrix on 10 procs but I thought > that it might still be useful to report the problem?? See below for the > error obtained when running with gdb and also a code snippet to reproduce > the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit > A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear system also > serves as the preconditioning matrix. > > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! 
remember: here A = > A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 12:36:22 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:36:22 -0400 Subject: [petsc-users] checking jacobian Message-ID: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> I?m a bit confused by the following options: Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. What flags do I pass it to get some output to diagnose my Jacobian error? -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 12:39:47 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 12:39:47 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:36 PM, Gideon Simpson wrote: > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show > difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? > I would start with -snes_check_jacobian_view ascii:bug.txt Matt > -gideon > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue Aug 11 12:40:40 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Aug 2015 11:40:40 -0600 Subject: [petsc-users] checking jacobian In-Reply-To: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> Message-ID: <87zj1xsoyv.fsf@jedbrown.org> Gideon Simpson writes: > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? Nothing to display ASCII to the screen. You might use "binary:thematrix" if you want to read it in with Python or MATLAB, for example. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue Aug 11 12:49:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:49:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: <87zj1xsoyv.fsf@jedbrown.org> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> Message-ID: <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. -gideon > On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: > > Gideon Simpson writes: > >> I?m a bit confused by the following options: >> >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >> >> What flags do I pass it to get some output to diagnose my Jacobian error? > > Nothing to display ASCII to the screen. You might use > "binary:thematrix" if you want to read it in with Python or MATLAB, for > example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 12:50:19 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 12:50:19 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: > Maybe it?s a quirk of the macports installation of petsc, but nothing > seems to be getting generated. > Run with -options_left. Is it reading the option? Matt > -gideon > > On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: > > Gideon Simpson writes: > > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show > difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? > > > Nothing to display ASCII to the screen. You might use > "binary:thematrix" if you want to read it in with Python or MATLAB, for > example. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 11 12:51:19 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:51:19 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Message-ID: <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> #End of PETSc Option Table entries There is one unused database option. It is: Option left: name:-snes_check_jacobian_view (no value) -gideon > On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: > Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. > > Run with -options_left. Is it reading the option? > > Matt > > -gideon > >> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >> >> Gideon Simpson > writes: >> >>> I?m a bit confused by the following options: >>> >>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>> >>> What flags do I pass it to get some output to diagnose my Jacobian error? >> >> Nothing to display ASCII to the screen. You might use >> "binary:thematrix" if you want to read it in with Python or MATLAB, for >> example. > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:03:05 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:03:05 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: > #End of PETSc Option Table entries > There is one unused database option. It is: > Option left: name:-snes_check_jacobian_view (no value) > This is the option for the newest release. What are you using? Matt > -gideon > > On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: > >> Maybe it?s a quirk of the macports installation of petsc, but nothing >> seems to be getting generated. >> > > Run with -options_left. Is it reading the option? > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >> >> Gideon Simpson writes: >> >> I?m a bit confused by the following options: >> >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >> difference of hand-coded and finite difference Jacobian. >> >> What flags do I pass it to get some output to diagnose my Jacobian error? >> >> >> Nothing to display ASCII to the screen. You might use >> "binary:thematrix" if you want to read it in with Python or MATLAB, for >> example. >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 13:04:19 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 14:04:19 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> Message-ID: <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Macports installation of 3.5.3. -gideon > On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: > #End of PETSc Option Table entries > There is one unused database option. It is: > Option left: name:-snes_check_jacobian_view (no value) > > This is the option for the newest release. What are you using? > > Matt > > -gideon > >> On Aug 11, 2015, at 1:50 PM, Matthew Knepley > wrote: >> >> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: >> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >> >> Run with -options_left. Is it reading the option? >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >>> >>> Gideon Simpson > writes: >>> >>>> I?m a bit confused by the following options: >>>> >>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>> >>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>> >>> Nothing to display ASCII to the screen. You might use >>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>> example. >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:07:00 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:07:00 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: > Macports installation of 3.5.3. > Use -help to find the option name. Maybe its -snes_test. Thanks, Matt > -gideon > > On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: > >> #End of PETSc Option Table entries >> There is one unused database option. It is: >> Option left: name:-snes_check_jacobian_view (no value) >> > > This is the option for the newest release. What are you using? > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson < >> gideon.simpson at gmail.com> wrote: >> >>> Maybe it?s a quirk of the macports installation of petsc, but nothing >>> seems to be getting generated. >>> >> >> Run with -options_left. 
Is it reading the option? >> >> Matt >> >> >>> -gideon >>> >>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>> >>> Gideon Simpson writes: >>> >>> I?m a bit confused by the following options: >>> >>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >>> difference of hand-coded and finite difference Jacobian. >>> >>> What flags do I pass it to get some output to diagnose my Jacobian error? >>> >>> >>> Nothing to display ASCII to the screen. You might use >>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>> example. >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Tue Aug 11 13:08:22 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Tue, 11 Aug 2015 11:08:22 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: <55CA3A16.90206@email.arizona.edu> Hi Hong, Sorry for my late reply and thanks for the fix. Does that mean that I will be able to run that matrix on 10 procs in the future (petsc 3.6.2?)? Thanks Anthony On 08/11/2015 09:58 AM, Hong wrote: > Anthony, > I pushed a fix > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > petsc-dev. > Thanks for reporting it! > > Hong > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > wrote: > > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST > numeric factorization; likely we have a mistake such as assuming a > process has at least one row of the matrix and need to fix it. > > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest > global col index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that > Superlu_dist was hanging when running on 4 processors with > parallel symbolic factorization. I have been using the latest > version of Superlu_dist and the code is not hanging anymore. > However, I noticed that when running the same matrix (I have > attached the matrix), the code crashes with the following SIGSEGV > when running on 10 procs (with or without parallel symbolic > factorization). It is probably overkill to run such a 'small' > matrix on 10 procs but I thought that it might still be useful to > report the problem?? See below for the error obtained when running > with gdb and also a code snippet to reproduce the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. 
> > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest > global col index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > on exit A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear > system also serves as the preconditioning matrix. > > !call > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > A = A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 13:09:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 14:09:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: I don?t see it listed in -help, but I do get ./blowup -xmax 50 -nx 1000 -snes_check_jacobian Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| -gideon > On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson > wrote: > Macports installation of 3.5.3. > > Use -help to find the option name. Maybe its -snes_test. > > Thanks, > > Matt > > -gideon > >> On Aug 11, 2015, at 2:03 PM, Matthew Knepley > wrote: >> >> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: >> #End of PETSc Option Table entries >> There is one unused database option. It is: >> Option left: name:-snes_check_jacobian_view (no value) >> >> This is the option for the newest release. What are you using? >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley > wrote: >>> >>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: >>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>> >>> Run with -options_left. Is it reading the option? >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >>>> >>>> Gideon Simpson > writes: >>>> >>>>> I?m a bit confused by the following options: >>>>> >>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>> >>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>> >>>> Nothing to display ASCII to the screen. 
You might use >>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>> example. >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:11:06 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:11:06 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 1:09 PM, Gideon Simpson wrote: > I don?t see it listed in -help, but I do get > > ./blowup -xmax 50 -nx 1000 -snes_check_jacobian > Testing hand-coded Jacobian, if the ratio is O(1.e-8), the > hand-coded Jacobian is probably correct. > Run with -snes_check_jacobian_view [viewer][:filename][:format] to > show difference of hand-coded and finite difference Jacobian. > 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| > It must be broken in 3.5.3 for some reason. I would use the latest release. Matt > -gideon > > On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson > wrote: > >> Macports installation of 3.5.3. >> > > Use -help to find the option name. Maybe its -snes_test. > > Thanks, > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson < >> gideon.simpson at gmail.com> wrote: >> >>> #End of PETSc Option Table entries >>> There is one unused database option. It is: >>> Option left: name:-snes_check_jacobian_view (no value) >>> >> >> This is the option for the newest release. What are you using? >> >> Matt >> >> >>> -gideon >>> >>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson < >>> gideon.simpson at gmail.com> wrote: >>> >>>> Maybe it?s a quirk of the macports installation of petsc, but nothing >>>> seems to be getting generated. >>>> >>> >>> Run with -options_left. Is it reading the option? >>> >>> Matt >>> >>> >>>> -gideon >>>> >>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>> >>>> Gideon Simpson writes: >>>> >>>> I?m a bit confused by the following options: >>>> >>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >>>> difference of hand-coded and finite difference Jacobian. >>>> >>>> What flags do I pass it to get some output to diagnose my Jacobian >>>> error? >>>> >>>> >>>> Nothing to display ASCII to the screen. You might use >>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>> example. >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Aug 11 13:33:08 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 11 Aug 2015 13:33:08 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55CA3A16.90206@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: yes - the patch will be in petsc 3.6.2. However - you can grab the patch right now - and start using it If using a 3.6.1 tarball - you can do download the (raw) patch from the url below and apply with: cd petsc-3.6.1 patch -Np1 < patchfile If using a git clone - you can do: git fetch git checkout ceeba3afeff0c18262ed13ef92e2508ca68b0ecf Satish On Tue, 11 Aug 2015, Anthony Haas wrote: > Hi Hong, > > Sorry for my late reply and thanks for the fix. Does that mean that I will be > able to run that matrix on 10 procs in the future (petsc 3.6.2?)? > > Thanks > > Anthony > > > On 08/11/2015 09:58 AM, Hong wrote: > > Anthony, > > I pushed a fix > > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > > petsc-dev. > > Thanks for reporting it! > > > > Hong > > > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > > wrote: > > > > > > Anthony, > > > > This crash is in PETSc code before it calls the SuperLU_DIST > > numeric factorization; likely we have a mistake such as assuming a > > process has at least one row of the matrix and need to fix it. > > > > Barry > > > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > > A=0x14a6a70, info=0x19099f8) > > > at > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > global col index of A */ > > > > > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > > wrote: > > > > > > Hi Sherry, > > > > > > I recently submitted a matrix for which I noticed that > > Superlu_dist was hanging when running on 4 processors with > > parallel symbolic factorization. I have been using the latest > > version of Superlu_dist and the code is not hanging anymore. > > However, I noticed that when running the same matrix (I have > > attached the matrix), the code crashes with the following SIGSEGV > > when running on 10 procs (with or without parallel symbolic > > factorization). It is probably overkill to run such a 'small' > > matrix on 10 procs but I thought that it might still be useful to > > report the problem?? See below for the error obtained when running > > with gdb and also a code snippet to reproduce the error. > > > > > > Thanks, > > > > > > > > > Anthony > > > > > > > > > > > > 1) ERROR in GDB > > > > > > Program received signal SIGSEGV, Segmentation fault. 
> > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > > A=0x14a6a70, info=0x19099f8) > > > at > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > global col index of A */ > > > (gdb) > > > > > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > > > Subroutine HowBigLUCanBe(rank) > > > > > > IMPLICIT NONE > > > > > > integer(i4b),intent(in) :: rank > > > integer(i4b) :: i,ct > > > real(dp) :: begin,endd > > > complex(dpc) :: sigma > > > > > > PetscErrorCode ierr > > > > > > > > > if (rank==0) call cpu_time(begin) > > > > > > if (rank==0) then > > > write(*,*) > > > write(*,*)'Testing How Big LU Can Be...' > > > write(*,*)'============================' > > > write(*,*) > > > endif > > > > > > !sigma = (1.0d0,0.0d0) > > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > > on exit A = A-sigma*B > > > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Write Matrix to ASCII and Binary Format > > > !call > > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > > !call MatView(DXX,viewer,ierr) > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > !call > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > > !call MatView(A,viewer,ierr) > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > !...Load a Matrix in Binary Format > > > call > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > > call MatSetType(DLOAD,MATAIJ,ierr) > > > call MatLoad(DLOAD,viewer,ierr) > > > call PetscViewerDestroy(viewer,ierr) > > > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > > !.....Create Linear Solver Context > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > > > !.....Set operators. Here the matrix that defines the linear > > system also serves as the preconditioning matrix. > > > !call > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > > commented and replaced by next line > > > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > > A-sigma*B > > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > > A = A-sigma*B > > > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > > Divergence Tol > > > tol = 1.e-10 > > > call > > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > > > !.....Set the Direct (LU) Solver > > > call KSPSetType(ksp,KSPPREONLY,ierr) > > > call KSPGetPC(ksp,pc,ierr) > > > call PCSetType(pc,PCLU,ierr) > > > call > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > > > !.....Create Right-Hand-Side Vector > > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > > > call > > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > > > allocate(xwork1(IendA-IstartA)) > > > allocate(loc(IendA-IstartA)) > > > > > > ct=0 > > > do i=IstartA,IendA-1 > > > ct=ct+1 > > > loc(ct)=i > > > xwork1(ct)=(1.0d0,0.0d0) > > > enddo > > > > > > call > > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > > call VecZeroEntries(sol,ierr) > > > > > > deallocate(xwork1,loc) > > > > > > !.....Assemble Vectors > > > call VecAssemblyBegin(frhs,ierr) > > > call VecAssemblyEnd(frhs,ierr) > > > > > > !.....Solve the Linear System > > > call KSPSolve(ksp,frhs,sol,ierr) > > > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > if (rank==0) then > > > call cpu_time(endd) > > > write(*,*) > > > print '("Total time for HowBigLUCanBe = ",f21.3," > > seconds.")',endd-begin > > > endif > > > > > > call SlepcFinalize(ierr) > > > > > > STOP > > > > > > > > > end Subroutine HowBigLUCanBe > > > > > > > > > > > > > From mc0710 at gmail.com Tue Aug 11 14:10:09 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Tue, 11 Aug 2015 14:10:09 -0500 Subject: [petsc-users] Petsc+Chombo example Message-ID: Hi, Is there an example where Petsc's SNES has been used with Chombo, and perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can pick out the number of colors of a Chombo data structure like it can do with a DMDA. Thanks, Mani -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 11 14:18:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Aug 2015 13:18:05 -0600 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: <87wpx1skgi.fsf@jedbrown.org> Mani Chandra writes: > Is there an example where Petsc's SNES has been used with Chombo, and > perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can > pick out the number of colors of a Chombo data structure like it can do > with a DMDA. You'll have to ask Chombo about this. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 11 14:25:57 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 14:25:57 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 2:10 PM, Mani Chandra wrote: > Hi, > > Is there an example where Petsc's SNES has been used with Chombo, and > perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can > pick out the number of colors of a Chombo data structure like it can do > with a DMDA. > The specific kinds of colorings for structured grids also assume a colocated discretization which I am not sure Chombo uses. However, the greedy colorings which only use the matrix will work. Thanks, Matt > Thanks, > Mani > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Tue Aug 11 16:40:19 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Aug 2015 16:40:19 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> You also have to KEEP the -snes_check_jacobian option Barry > On Aug 11, 2015, at 1:09 PM, Gideon Simpson wrote: > > I don?t see it listed in -help, but I do get > > ./blowup -xmax 50 -nx 1000 -snes_check_jacobian > Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. > 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| > > -gideon > >> On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: >> Macports installation of 3.5.3. >> >> Use -help to find the option name. Maybe its -snes_test. >> >> Thanks, >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: >>> #End of PETSc Option Table entries >>> There is one unused database option. It is: >>> Option left: name:-snes_check_jacobian_view (no value) >>> >>> This is the option for the newest release. What are you using? >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: >>>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>>> >>>> Run with -options_left. Is it reading the option? >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>>> >>>>> Gideon Simpson writes: >>>>> >>>>>> I?m a bit confused by the following options: >>>>>> >>>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>>> >>>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>>> >>>>> Nothing to display ASCII to the screen. You might use >>>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>>> example. >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From xsli at lbl.gov Tue Aug 11 18:49:05 2015 From: xsli at lbl.gov (Xiaoye S. Li) Date: Tue, 11 Aug 2015 16:49:05 -0700 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: ?It's hard to say. 
For 3D problems, you may get a fill factor about 30x-50x (can be larger or smaller depending on problem.) The time may be in seconds, or minutes at most. Sherry On Tue, Aug 11, 2015 at 7:31 AM, Mahir.Ulker-Kaustell at tyrens.se < Mahir.Ulker-Kaustell at tyrens.se> wrote: > Yes! Doing: > > $PETSC_DIR/$PETSC_ARCH/bin/mpiexec > > instead of > > mpiexec > > makes the program run as expected. > > Thank you all for your patience and encouragement. > > Sherry: I have noticed that you have been involved in some publications > related to my current work, i.e. wave propagation in elastic solids. What > computation time would you expect using SuperLU to solve one linear system > with say 800000 degrees of freedom and 4-8 processes (on a single node) > with a finite element discretization? > > Mahir > > > > > > -----Original Message----- > From: Satish Balay [mailto:balay at mcs.anl.gov] > Sent: den 7 augusti 2015 18:09 > To: ?lker-Kaustell, Mahir > Cc: Hong; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > This usually happens if you use the wrong MPIEXEC > > i.e use the mpiexec from the MPI you built PETSc with. > > Satish > > On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > Hong, > > > > Running example 2 with the command line given below gives me two > uniprocessor runs!? > > > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -ksp_view > > KSP Object: 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > total: nonzeros=250, allocated nonzeros=280 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Norm of error 5.21214e-15 iterations 1 > > KSP Object: 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > 
Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > total: nonzeros=250, allocated nonzeros=280 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Norm of error 5.21214e-15 iterations 1 > > > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 6 augusti 2015 16:36 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > > > I have been using PETSC_COMM_WORLD. > > > > What do you get by running a petsc example, e.g., > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -ksp_view > > > > KSP Object: 2 MPI processes > > type: gmres > > ... > > > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 5 augusti 2015 17:11 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > As you noticed, you ran the code in serial mode, not parallel. > > Check your code on input communicator, e.g., what input communicator do > you use in > > KSPCreate(comm,&ksp)? > > > > I have added error flag to superlu_dist interface (released version). > When user uses '-mat_superlu_dist_parsymbfact' > > in serial mode, this option is ignored with a warning. > > > > Hong > > > > Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? 
> > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 3 augusti 2015 19:06 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL > for parallel runs. > > > > If I use 2 processors, the program runs if I use > ?mat_superlu_dist_parsymbfact=1: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... > > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 3 augusti 2015 16:46 > > To: Xiaoye S. Li > > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. 
I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li xsli at lbl.gov>> wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem > remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: > Invalid ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 30 juli 2015 02:58 > > To: ?lker-Kaustell, Mahir > > Cc: Xiaoye Li; PETSc users list > > > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > > From: Xiaoye S. Li > > > Date: Wed, Jul 29, 2015 at 2:24 PM > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > To: Hong Zhang > > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > > ? 
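
(As a concrete illustration of Hong's rebuild steps above — step 2 in particular. The option list below is only a sketch: keep whatever configure options you used originally and just point --download-superlu_dist at the 4.1 tarball; the path is a placeholder.)

    cd $PETSC_DIR
    rm -rf $PETSC_ARCH        # step 2: remove the existing build directory
    ./configure --with-scalar-type=complex --with-debugging=0 --download-mpich \
        --download-metis --download-parmetis \
        --download-superlu_dist=/path/to/superlu_dist_4.1.tar.gz
    make all                  # step 3
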
> > > > ---------- Forwarded message ---------- > > From: Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> > > Date: Wed, Jul 22, 2015 at 1:32 PM > > Subject: RE: [petsc-users] SuperLU MPI-problem > > To: Hong >, "Xiaoye S. > Li" > > > Cc: petsc-users >> > > The 1000 was just a conservative guess. The number of non-zeros per row > is in the tens in general but certain constraints lead to non-diagonal > streaks in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I > set options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 22 juli 2015 21:34 > > To: Xiaoye S. Li > > Cc: ?lker-Kaustell, Mahir; petsc-users > > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li xsli at lbl.gov>> wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. 
> > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it > sufficient to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> wrote: > > Thank you for your reply. > > > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > > Apparently, there is no support for Valgrind in this OS. > > > > If I have understood you correct, the memory issues are related to > superLU and given my background, there is not much I can do. Is this > correct? > > > > > > Best regards, > > Mahir > > > > ______________________________________________ > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> > > ______________________________________________ > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: den 22 juli 2015 02:57 > > To: ?lker-Kaustell, Mahir > > Cc: Xiaoye S. Li; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
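
(For reference, a typical way to produce this kind of valgrind report for an MPI run — the exact command used here is not shown, so the solver options below are only illustrative, and the mpiexec must be the one PETSc was built with:

    $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 3 valgrind -q --tool=memcheck --track-origins=yes \
        ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact

The --track-origins=yes flag is what makes valgrind print the "Uninitialised value was created by ..." lines seen in the output below.)
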
> > > > Barry > > > > > > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10155751B: get_perm_c_parmetis > (get_perm_c_parmetis.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10155751B: get_perm_c_parmetis > (get_perm_c_parmetis.c:96) > > ==42050== > > ==42049== Syscall param writev(vector[...]) points to uninitialised > byte(s) > > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== Syscall param writev(vector[...]) 
points to uninitialised > byte(s) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x100FF9036: PCSetUp 
(precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42048== Syscall param write(buf) points to uninitialised byte(s) > > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 
0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Address 0x104810704 is on thread 1's stack > > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > 
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Syscall param writev(vector[...]) points to uninitialised > byte(s) > > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 
0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a heap allocation > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== > > ==42048== Conditional jump or move depends on uninitialised value(s) > > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42048== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU 
(lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42049== Conditional jump or move depends on uninitialised value(s) > > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42048== Conditional jump or move depends on uninitialised value(s) > > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== Conditional jump or move depends on uninitialised value(s) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > 
==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a heap allocation > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x10150B241: ddist_psymbtonum > (pdsymbfact_distdata.c:1389) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== > > > > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> wrote: > > > > > > Ok. So I have been creating the full factorization on each process. > That gives me some hope! > > > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > > However, now the program crashes with: > > > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > > > And so on? > > > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in > the PETSc documentation > > > > > > Mahir > > > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> > > > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > > Sent: den 20 juli 2015 18:12 > > > To: ?lker-Kaustell, Mahir > > > Cc: Hong; petsc-users > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > > > Sherry Li > > > > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > > wrote: > > > Hong: > > > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > > > The large test problem I am trying to solve has slightly less than > 10^6 degrees of freedom. The matrices are derived from finite elements so > they are sparse. > > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > > > Mahir > > > > > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > > Sent: den 20 juli 2015 17:39 > > > To: ?lker-Kaustell, Mahir > > > Cc: petsc-users > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > Mahir: > > > Direct solvers consume large amount of memory. Suggest to try > followings: > > > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > > Do you get memory crash in the 1st symbolic factorization? > > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > > > 3. Use a machine that gives larger memory. > > > > > > Hong > > > > > > Dear Petsc-Users, > > > > > > I am trying to use PETSc to solve a set of linear equations arising > from Naviers equation (elastodynamics) in the frequency domain. > > > The frequency dependency of the problem requires that the system > > > > > > [-omega^2M + K]u = F > > > > > > where M and K are constant, square, positive definite matrices (mass > and stiffness respectively) is solved for each frequency omega of interest. > > > K is a complex matrix, including material damping. > > > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. 
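
(A minimal, self-contained sketch of this kind of frequency sweep with PETSc and SuperLU_DIST is given below. It is not the actual program: the matrices are toy stand-ins — a real tridiagonal "K" and a lumped diagonal "M" — the sizes and frequencies are placeholders, error checking with CHKERRQ is omitted, and the complex material damping is left out. Its only purpose is to show the structure discussed above: the sparsity pattern of A = -omega^2 M + K never changes, so one KSP/PC is created once and every frequency reuses it, with the repeated factorizations done as SamePattern_SameRowPerm.)

    /* sweep.c -- sketch only; toy matrices, placeholder sizes, no CHKERRQ */
    #include <petscksp.h>

    int main(int argc,char **argv)
    {
      Mat       K,M,A;
      Vec       F,u;
      KSP       ksp;
      PC        pc;
      PetscInt  n = 1000,i,Istart,Iend,k,nfreq = 10;
      PetscReal omega,omega0 = 1.0,domega = 0.5;

      PetscInitialize(&argc,&argv,NULL,NULL);

      /* Toy stand-ins for the FE matrices: K tridiagonal, M diagonal (lumped). */
      MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n,3,NULL,1,NULL,&K);
      MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n,1,NULL,0,NULL,&M);
      MatGetOwnershipRange(K,&Istart,&Iend);
      for (i=Istart; i<Iend; i++) {
        if (i>0)   MatSetValue(K,i,i-1,-1.0,INSERT_VALUES);
        if (i<n-1) MatSetValue(K,i,i+1,-1.0,INSERT_VALUES);
        MatSetValue(K,i,i,2.0,INSERT_VALUES);
        MatSetValue(M,i,i,1.0,INSERT_VALUES);
      }
      MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);
      MatAssemblyBegin(M,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(M,MAT_FINAL_ASSEMBLY);

      MatDuplicate(K,MAT_COPY_VALUES,&A);   /* A gets K's (fixed) sparsity pattern */
      MatCreateVecs(A,&u,&F);
      VecSet(F,1.0);

      /* One KSP/PC for the whole sweep: direct solve with SuperLU_DIST. */
      KSPCreate(PETSC_COMM_WORLD,&ksp);     /* parallel communicator, not PETSC_COMM_SELF */
      KSPSetType(ksp,KSPPREONLY);
      KSPGetPC(ksp,&pc);
      PCSetType(pc,PCLU);
      PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);
      KSPSetFromOptions(ksp);               /* picks up -mat_superlu_dist_* options */

      for (k=0; k<nfreq; k++) {             /* frequency sweep */
        omega = omega0 + k*domega;
        MatCopy(K,A,SAME_NONZERO_PATTERN);
        MatAXPY(A,-omega*omega,M,SUBSET_NONZERO_PATTERN);  /* A = K - omega^2 M */
        KSPSetOperators(ksp,A,A);           /* values change, pattern does not:   */
        KSPSolve(ksp,F,u);                  /* factorization is redone inside     */
      }                                     /* KSPSetUp/KSPSolve each iteration   */

      KSPDestroy(&ksp); MatDestroy(&A); MatDestroy(&K); MatDestroy(&M);
      VecDestroy(&u); VecDestroy(&F);
      PetscFinalize();
      return 0;
    }

(It would be run like the other examples in this thread, e.g. $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 4 ./sweep -ksp_view, where ./sweep is a placeholder name.)
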
> > > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > > > I would guess that similar problems have occurred in this mail-list, > so I am hoping that someone can push me in the right direction? > > > > > > Mahir > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Tue Aug 11 20:53:47 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 11 Aug 2015 20:53:47 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors Message-ID: Hi all, I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion using paraview. I got the following errors: ./petsc_gen_xdmf.py sol.h5 Traceback (most recent call last): File "./petsc_gen_xdmf.py", line 236, in generateXdmf(sys.argv[1]) File "./petsc_gen_xdmf.py", line 231, in generateXdmf Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) File "./petsc_gen_xdmf.py", line 190, in write for vf in vfields: self.writeField(fp, len(time), t, cellDim, spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') File "./petsc_gen_xdmf.py", line 164, in writeField self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, domain) File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents dims = '1 %d 1' % (numSteps, dof, bs) TypeError: not all arguments converted during string formatting The hdf5 file is attached. Originally from Matthew. Configuration and make log files are also attached. Fande Kong, Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sol.h5 Type: application/octet-stream Size: 246288 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 6236054 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 104776 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue Aug 11 22:40:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 23:40:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> Message-ID: <03AE51B1-F431-4534-B04F-B80BEBF2EFE1@gmail.com> Barry?s comment resolved my issue. -gideon > On Aug 11, 2015, at 5:40 PM, Barry Smith wrote: > > You also have to KEEP the -snes_check_jacobian option > > Barry > >> On Aug 11, 2015, at 1:09 PM, Gideon Simpson wrote: >> >> I don?t see it listed in -help, but I do get >> >> ./blowup -xmax 50 -nx 1000 -snes_check_jacobian >> Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. 
>> 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| >> >> -gideon >> >>> On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: >>> Macports installation of 3.5.3. >>> >>> Use -help to find the option name. Maybe its -snes_test. >>> >>> Thanks, >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: >>>> #End of PETSc Option Table entries >>>> There is one unused database option. It is: >>>> Option left: name:-snes_check_jacobian_view (no value) >>>> >>>> This is the option for the newest release. What are you using? >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>>>> >>>>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: >>>>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>>>> >>>>> Run with -options_left. Is it reading the option? >>>>> >>>>> Matt >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>>>> >>>>>> Gideon Simpson writes: >>>>>> >>>>>>> I?m a bit confused by the following options: >>>>>>> >>>>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>>>> >>>>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>>>> >>>>>> Nothing to display ASCII to the screen. You might use >>>>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>>>> example. >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Wed Aug 12 02:02:39 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Wed, 12 Aug 2015 07:02:39 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> , Message-ID: <1439362955895.53919@tyrens.se> Ok, thank you. I have 1-2 minutes in a commercial code and was of course hoping that PETScSuperLU would be at least that fast. Currently PETSc/SuperLU is around 10 times slower, so I have to dig a little deeper, but now I know it will be worthwhile... Mahir ________________________________ Fr?n: Xiaoye S. Li Skickat: den 12 augusti 2015 01:49 Till: ?lker-Kaustell, Mahir Kopia: petsc-users ?mne: Re: [petsc-users] SuperLU MPI-problem ?It's hard to say. For 3D problems, you may get a fill factor about 30x-50x (can be larger or smaller depending on problem.) The time may be in seconds, or minutes at most. Sherry On Tue, Aug 11, 2015 at 7:31 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Yes! 
Doing: $PETSC_DIR/$PETSC_ARCH/bin/mpiexec instead of mpiexec makes the program run as expected. Thank you all for your patience and encouragement. Sherry: I have noticed that you have been involved in some publications related to my current work, i.e. wave propagation in elastic solids. What computation time would you expect using SuperLU to solve one linear system with say 800000 degrees of freedom and 4-8 processes (on a single node) with a finite element discretization? Mahir -----Original Message----- From: Satish Balay [mailto:balay at mcs.anl.gov] Sent: den 7 augusti 2015 18:09 To: ?lker-Kaustell, Mahir Cc: Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > 
Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... > > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? 
> KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. > > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. 
Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li >> wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li >> > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang >> > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se> >> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >>, "Xiaoye S. Li" >> > Cc: petsc-users >> > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. 
> ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li >> wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? 
> > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se> > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov>] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) 
> ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > 
==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis 
(get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends 
on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > 
==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp 
(precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se> wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. 
> > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se> > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov>] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
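A minimal sketch of the reuse pattern Hong suggests in the thread above: assemble A = K - omega^2 M into one matrix with a fixed nonzero pattern and let the LU preconditioner refactor it numerically for each frequency. It assumes M and K are already-assembled AIJ matrices with compatible nonzero patterns and that b, u, omegas[] and nfreq exist in the calling code; it is an illustration under those assumptions, not the poster's actual program.

-------------
#include <petscksp.h>

/* Sketch only: M, K are assembled AIJ matrices; A inherits K's nonzero pattern. */
Mat            A;
KSP            ksp;
PC             pc;
PetscInt       i;
PetscErrorCode ierr;

ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);
ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

for (i = 0; i < nfreq; ++i) {
  ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  /* Use SUBSET_NONZERO_PATTERN instead if M's pattern is only a subset of K's. */
  ierr = MatAXPY(A, -omegas[i]*omegas[i], M, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr); /* same pattern, so the symbolic setup can be reused */
  ierr = KSPSolve(ksp, b, u);CHKERRQ(ierr);
}
ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
ierr = MatDestroy(&A);CHKERRQ(ierr);
-------------

Whether the symbolic factorization is actually reused can be checked with -ksp_view (SuperLU_DIST reports "Repeated factorization SamePattern_SameRowPerm") and -log_summary.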
URL: From torquil at gmail.com Wed Aug 12 09:29:20 2015 From: torquil at gmail.com (=?UTF-8?Q?Torquil_Macdonald_S=c3=b8rensen?=) Date: Wed, 12 Aug 2015 16:29:20 +0200 Subject: [petsc-users] Duplicate options Message-ID: <55CB5840.2050109@gmail.com> Hi! Is it intentional that Petsc prints duplicates of the matrix-related options in the following test program that creates two matrices?: ------------- Mat A; ierr = MatCreate(PETSC_COMM_SELF, &A); CHKERRQ(ierr); ierr = MatSetType(A, MATSEQAIJ); CHKERRQ(ierr); Mat B; ierr = MatCreate(PETSC_COMM_SELF, &B); CHKERRQ(ierr); ierr = MatSetType(B, MATSEQAIJ); CHKERRQ(ierr); -------------- The output when running with -help contains: Options for SEQAIJ matrix ------------------------------------------------- -mat_no_unroll: Do not optimize for inodes (slower) (None) -mat_no_inode: Do not optimize for inodes -slower- (None) -mat_inode_limit <5>: Do not use inodes larger then this value (None) Options for SEQAIJ matrix ------------------------------------------------- -mat_no_unroll: Do not optimize for inodes (slower) (None) -mat_no_inode: Do not optimize for inodes -slower- (None) -mat_inode_limit <5>: Do not use inodes larger then this value (None) The section "Options for SEQAIJ matrix" is repeated. The reason I ask is because I have another Petsc program that prints an enormous amount of duplicate lines when running with -help. I found this old thread from 2006 about the same problem: http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html Best regards Torquil S?rensen From jed at jedbrown.org Wed Aug 12 09:39:29 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 08:39:29 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: <55CB5840.2050109@gmail.com> References: <55CB5840.2050109@gmail.com> Message-ID: <87h9o4sh9a.fsf@jedbrown.org> Torquil Macdonald S?rensen writes: > The section "Options for SEQAIJ matrix" is repeated. The reason I ask is > because I have another Petsc program that prints an enormous amount of > duplicate lines when running with -help. I found this old thread from > 2006 about the same problem: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html Sadly, this is still a known problem and it got worse when we became more consistent about printing options for things like matrices and vectors (which rarely have prefixes and for which many options used to be hidden). Fixing it is somewhat at odds with our desire to remove global variables whenever possible, but I think it needs to be fixed. I tend to filter -help output with grep, FWIW. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mfadams at lbl.gov Wed Aug 12 09:56:44 2015 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 12 Aug 2015 10:56:44 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 3:25 PM, Matthew Knepley wrote: > On Tue, Aug 11, 2015 at 2:10 PM, Mani Chandra wrote: > >> Hi, >> >> Is there an example where Petsc's SNES has been used with Chombo, and >> perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can >> pick out the number of colors of a Chombo data structure like it can do >> with a DMDA. >> > > The specific kinds of colorings for structured grids also assume a > colocated discretization which > I am not sure Chombo uses. However, the greedy colorings which only use > the matrix will work. 
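On the coloring point quoted just above: a minimal sketch of driving a finite-difference Jacobian from the nonzero pattern of an already preallocated MPIAIJ matrix J, using a greedy coloring computed from the matrix alone (no DMDA needed). The names snes, J, ctx and FormFunction are assumptions standing in for the application's own objects.

-------------
#include <petscsnes.h>

/* Sketch: snes, ctx, FormFunction and a preallocated MPIAIJ Jacobian J with the
   correct nonzero pattern are assumed to come from the application (Chombo here). */
ISColoring     iscoloring;
MatColoring    mc;
MatFDColoring  fdcoloring;
PetscErrorCode ierr;

ierr = MatColoringCreate(J, &mc);CHKERRQ(ierr);
ierr = MatColoringSetType(mc, MATCOLORINGGREEDY);CHKERRQ(ierr);  /* coloring from the matrix only */
ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
ierr = MatColoringApply(mc, &iscoloring);CHKERRQ(ierr);
ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);

ierr = MatFDColoringCreate(J, iscoloring, &fdcoloring);CHKERRQ(ierr);
ierr = MatFDColoringSetFunction(fdcoloring, (PetscErrorCode (*)(void))FormFunction, ctx);CHKERRQ(ierr);
ierr = MatFDColoringSetFromOptions(fdcoloring);CHKERRQ(ierr);
ierr = MatFDColoringSetUp(J, iscoloring, fdcoloring);CHKERRQ(ierr);
ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);

/* SNES then fills J by finite differences using the coloring. */
ierr = SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, fdcoloring);CHKERRQ(ierr);
-------------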
> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly should work. I have put a SNES in a Chombo code, but did not use automatic Jacobian assembly. Mark > > Thanks, > > Matt > > >> Thanks, >> Mani >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 12 10:10:46 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 10:10:46 -0500 Subject: [petsc-users] Duplicate options In-Reply-To: <87h9o4sh9a.fsf@jedbrown.org> References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> Message-ID: On Wed, Aug 12, 2015 at 9:39 AM, Jed Brown wrote: > Torquil Macdonald S?rensen writes: > > The section "Options for SEQAIJ matrix" is repeated. The reason I ask is > > because I have another Petsc program that prints an enormous amount of > > duplicate lines when running with -help. I found this old thread from > > 2006 about the same problem: > > > > http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html > > Sadly, this is still a known problem and it got worse when we became > more consistent about printing options for things like matrices and > vectors (which rarely have prefixes and for which many options used to > be hidden). Fixing it is somewhat at odds with our desire to remove > global variables whenever possible, but I think it needs to be fixed. I > tend to filter -help output with grep, FWIW. > What will we use to uniquely identify a block of options? I hate the idea of a random string. Its too easy to mess up. Should we use a class+type_name? Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:17:56 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:17:56 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> Message-ID: <878u9gsfh7.fsf@jedbrown.org> Matthew Knepley writes: > What will we use to uniquely identify a block of options? I hate the idea > of a random string. > Its too easy to mess up. Should we use a class+type_name? Prefix is relevant. We could hash the contents to make the comparison fixed-length. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 12 10:19:33 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 10:19:33 -0500 Subject: [petsc-users] Duplicate options In-Reply-To: <878u9gsfh7.fsf@jedbrown.org> References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> <878u9gsfh7.fsf@jedbrown.org> Message-ID: On Wed, Aug 12, 2015 at 10:17 AM, Jed Brown wrote: > Matthew Knepley writes: > > What will we use to uniquely identify a block of options? I hate the idea > > of a random string. > > Its too easy to mess up. Should we use a class+type_name? > > Prefix is relevant. We could hash the contents to make the comparison > fixed-length. > So, SHA1(class, type_name, prefix)? I could live with that. 
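A minimal standalone sketch of the de-duplication idea being discussed here: hash the (class, type_name, prefix) triple that identifies a block of -help output and skip blocks already seen. It uses a simple FNV-1a hash and a flat array purely to keep the example self-contained in place of SHA1 and khash; it is an illustration of the idea, not PETSc source code.

-------------
#include <stdio.h>
#include <stdint.h>

/* Illustration only: hash the (class, type_name, prefix) triple. */
static uint64_t HelpBlockHash(const char *cls, const char *type, const char *prefix)
{
  const char *parts[3] = {cls, type, prefix};
  uint64_t    h = 0xcbf29ce484222325ULL;            /* FNV-1a offset basis */
  for (int i = 0; i < 3; ++i) {
    for (const char *p = parts[i] ? parts[i] : ""; *p; ++p) {
      h ^= (unsigned char)*p;
      h *= 0x100000001b3ULL;                        /* FNV-1a prime */
    }
    h ^= 0xffu;                                     /* field separator */
    h *= 0x100000001b3ULL;
  }
  return h;
}

/* Flat array standing in for the khash table mentioned above. */
static uint64_t seen[1024];
static size_t   nseen;

static int AlreadyPrinted(uint64_t h)
{
  for (size_t i = 0; i < nseen; ++i) if (seen[i] == h) return 1;
  if (nseen < sizeof(seen) / sizeof(seen[0])) seen[nseen++] = h;
  return 0;
}

int main(void)
{
  const char *blocks[3][3] = {{"Mat", "seqaij", ""}, {"Mat", "seqaij", ""}, {"Vec", "seq", ""}};
  for (int i = 0; i < 3; ++i) {
    uint64_t h = HelpBlockHash(blocks[i][0], blocks[i][1], blocks[i][2]);
    if (AlreadyPrinted(h)) continue;                /* duplicate block: print once */
    printf("Options for %s %s -------------------------------------------------\n",
           blocks[i][1], blocks[i][0]);
  }
  return 0;
}
-------------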
Then we maintain a khash table of those we have seen while printing. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:27:48 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:27:48 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> <878u9gsfh7.fsf@jedbrown.org> Message-ID: <87614ksf0r.fsf@jedbrown.org> Matthew Knepley writes: > So, SHA1(class, type_name, prefix)? I could live with that. Then we > maintain a khash table of those we have seen while printing. Yeah, ultimately with a reader lock for thread safety. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From ustc.liu at gmail.com Wed Aug 12 10:35:45 2015 From: ustc.liu at gmail.com (sheng liu) Date: Wed, 12 Aug 2015 23:35:45 +0800 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: Thank you very much! I have another question. If I need all the eigenvalues of the sparse matrix, which solver should I use? Thanks! 2015-08-09 1:52 GMT+08:00 Barry Smith : > > > On Aug 8, 2015, at 7:52 AM, sheng liu wrote: > > > > Hello: > > I have a large sparse symmetric matrix ( about 1000000x1000000), and > I need about 10 eigenvalues near 0. The problem is: I need to run the same > program about 1000 times, each time I need to change the diagonal matrix > elements ( and they are generated randomly). Is there a fast way to > implement this problem? Thank you! > > Does each run depend on the previous one or are they all independent? > > If they are independent I would introduce two levels of parallelism: On > the outer level have different MPI communicators compute different random > diagonal perturbations and on the inner level use a small amount of > parallelism for each eigenvalue solve. The outer level of parallelism is > embarrassingly parallel. > > Of course, for runs of the eigensolve use -log_summary to make sure it > is running efficiently and tune the amount of parallelism in the eigensolve > for best performance. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:46:21 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:46:21 -0600 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: <87y4hgqzle.fsf@jedbrown.org> sheng liu writes: > Thank you very much! I have another question. If I need all the eigenvalues > of the sparse matrix, which solver should I use? Thanks! That's O(n^3) with n=1e6. Better find a way to not need all the eigenvalues or to make the system smaller. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From hzhang at mcs.anl.gov Wed Aug 12 10:58:50 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 12 Aug 2015 10:58:50 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: Anthony, I just patched petsc-maint branch. Your matrix Amat_binary.m has empty diagonal blocks. Most petsc solvers require matrix diagonal entries to be allocated as 'non-zero', i.e., insert zero values to these zero entries. I would suggest you add zeros to Amat_binary.m during its buildup. This would enable petsc solvers, as well as other packages. Again, thanks for bug reporting. Hong On Tue, Aug 11, 2015 at 1:33 PM, Satish Balay wrote: > yes - the patch will be in petsc 3.6.2. > > However - you can grab the patch right now - and start using it > > If using a 3.6.1 tarball - you can do download the (raw) patch from > the url below and apply with: > > cd petsc-3.6.1 > patch -Np1 < patchfile > > If using a git clone - you can do: > > git fetch > git checkout ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > Satish > > On Tue, 11 Aug 2015, Anthony Haas wrote: > > > Hi Hong, > > > > Sorry for my late reply and thanks for the fix. Does that mean that I > will be > > able to run that matrix on 10 procs in the future (petsc 3.6.2?)? > > > > Thanks > > > > Anthony > > > > > > On 08/11/2015 09:58 AM, Hong wrote: > > > Anthony, > > > I pushed a fix > > > > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > > > > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > > > petsc-dev. > > > Thanks for reporting it! > > > > > > Hong > > > > > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > > > wrote: > > > > > > > > > Anthony, > > > > > > This crash is in PETSc code before it calls the SuperLU_DIST > > > numeric factorization; likely we have a mistake such as assuming a > > > process has at least one row of the matrix and need to fix it. > > > > > > Barry > > > > > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST > (F=0x1922b50, > > > > A=0x14a6a70, info=0x19099f8) > > > > at > > > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > > global col index of A */ > > > > > > > > > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > > > wrote: > > > > > > > > Hi Sherry, > > > > > > > > I recently submitted a matrix for which I noticed that > > > Superlu_dist was hanging when running on 4 processors with > > > parallel symbolic factorization. I have been using the latest > > > version of Superlu_dist and the code is not hanging anymore. > > > However, I noticed that when running the same matrix (I have > > > attached the matrix), the code crashes with the following SIGSEGV > > > when running on 10 procs (with or without parallel symbolic > > > factorization). It is probably overkill to run such a 'small' > > > matrix on 10 procs but I thought that it might still be useful to > > > report the problem?? See below for the error obtained when running > > > with gdb and also a code snippet to reproduce the error. > > > > > > > > Thanks, > > > > > > > > > > > > Anthony > > > > > > > > > > > > > > > > 1) ERROR in GDB > > > > > > > > Program received signal SIGSEGV, Segmentation fault. 
> > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST > (F=0x1922b50, > > > > A=0x14a6a70, info=0x19099f8) > > > > at > > > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > > global col index of A */ > > > > (gdb) > > > > > > > > > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > > > > > Subroutine HowBigLUCanBe(rank) > > > > > > > > IMPLICIT NONE > > > > > > > > integer(i4b),intent(in) :: rank > > > > integer(i4b) :: i,ct > > > > real(dp) :: begin,endd > > > > complex(dpc) :: sigma > > > > > > > > PetscErrorCode ierr > > > > > > > > > > > > if (rank==0) call cpu_time(begin) > > > > > > > > if (rank==0) then > > > > write(*,*) > > > > write(*,*)'Testing How Big LU Can Be...' > > > > write(*,*)'============================' > > > > write(*,*) > > > > endif > > > > > > > > !sigma = (1.0d0,0.0d0) > > > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > > > on exit A = A-sigma*B > > > > > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > !.....Write Matrix to ASCII and Binary Format > > > > !call > > > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > > > !call MatView(DXX,viewer,ierr) > > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > > > !call > > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > > > !call MatView(A,viewer,ierr) > > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > > > !...Load a Matrix in Binary Format > > > > call > > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > > > call MatSetType(DLOAD,MATAIJ,ierr) > > > > call MatLoad(DLOAD,viewer,ierr) > > > > call PetscViewerDestroy(viewer,ierr) > > > > > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > > > > > !.....Create Linear Solver Context > > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > > > > > !.....Set operators. Here the matrix that defines the linear > > > system also serves as the preconditioning matrix. > > > > !call > > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > > > commented and replaced by next line > > > > > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > > > A-sigma*B > > > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > > > A = A-sigma*B > > > > > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > > > Divergence Tol > > > > tol = 1.e-10 > > > > call > > > > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > > > > > !.....Set the Direct (LU) Solver > > > > call KSPSetType(ksp,KSPPREONLY,ierr) > > > > call KSPGetPC(ksp,pc,ierr) > > > > call PCSetType(pc,PCLU,ierr) > > > > call > > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> > > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > > > > > !.....Create Right-Hand-Side Vector > > > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > > > > > call > > > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > > > > > allocate(xwork1(IendA-IstartA)) > > > > allocate(loc(IendA-IstartA)) > > > > > > > > ct=0 > > > > do i=IstartA,IendA-1 > > > > ct=ct+1 > > > > loc(ct)=i > > > > xwork1(ct)=(1.0d0,0.0d0) > > > > enddo > > > > > > > > call > > > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > > > call VecZeroEntries(sol,ierr) > > > > > > > > deallocate(xwork1,loc) > > > > > > > > !.....Assemble Vectors > > > > call VecAssemblyBegin(frhs,ierr) > > > > call VecAssemblyEnd(frhs,ierr) > > > > > > > > !.....Solve the Linear System > > > > call KSPSolve(ksp,frhs,sol,ierr) > > > > > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > if (rank==0) then > > > > call cpu_time(endd) > > > > write(*,*) > > > > print '("Total time for HowBigLUCanBe = ",f21.3," > > > seconds.")',endd-begin > > > > endif > > > > > > > > call SlepcFinalize(ierr) > > > > > > > > STOP > > > > > > > > > > > > end Subroutine HowBigLUCanBe > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Wed Aug 12 13:29:32 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Wed, 12 Aug 2015 13:29:32 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: Hi, > Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly > should work. > > I have put a SNES in a Chombo code, but did not use automatic Jacobian > assembly. > Do you have an example? Thanks, Mani -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Wed Aug 12 14:01:32 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Wed, 12 Aug 2015 12:01:32 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: <55CB980C.8040507@email.arizona.edu> Hi Hong, I have attached a schematic of my matrices. I solve a generalized EVP in shift-and-invert mode. As you will see, I have indeed a zero diagonal block in the matrices A and B (blocks 4,4). I guess I could just add the zero entries to the diagonal elements of A? Is that strictly necessary when using a direct (LU) solver? Can you please give me a short explanation of why empty diagonal blocks can be problematic? Is the patch still available at: https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf ? Thanks again, Anthony On 08/12/2015 08:58 AM, Hong wrote: > Anthony, > I just patched petsc-maint branch. > > Your matrix Amat_binary.m has empty diagonal blocks. Most petsc > solvers require matrix diagonal entries to be allocated as 'non-zero', > i.e., insert zero values to these zero entries. I would suggest you > add zeros to Amat_binary.m during its buildup. This would enable > petsc solvers, as well as other packages. > > Again, thanks for bug reporting. > > Hong > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BiGlobal-temporal-A-and-B.pdf Type: application/pdf Size: 316365 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 12 14:17:08 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 14:17:08 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55CB980C.8040507@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> <55CB980C.8040507@email.arizona.edu> Message-ID: On Wed, Aug 12, 2015 at 2:01 PM, Anthony Haas wrote: > Hi Hong, > > I have attached a schematic of my matrices. I solve a generalized EVP in > shift-and-invert mode. As you will see, I have indeed a zero diagonal block > in the matrices A and B (blocks 4,4). I guess I could just add the zero > entries to the diagonal elements of A? Is that strictly necessary when > using a direct (LU) solver? Can you please give me a short explanation of > why empty diagonal blocks can be problematic? > Yes, the sparse numbering schemes use the location of the diagonal for faster indexing in many places. Matt > Is the patch still available at: > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > ? > > Thanks again, > > Anthony > > > > > > > On 08/12/2015 08:58 AM, Hong wrote: > >> Anthony, >> I just patched petsc-maint branch. >> >> Your matrix Amat_binary.m has empty diagonal blocks. Most petsc solvers >> require matrix diagonal entries to be allocated as 'non-zero', i.e., insert >> zero values to these zero entries. I would suggest you add zeros to >> Amat_binary.m during its buildup. This would enable petsc solvers, as well >> as other packages. >> >> Again, thanks for bug reporting. >> >> Hong >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 12 14:22:34 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 12 Aug 2015 14:22:34 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Fande, Your sol.h5 is old and outdated. If you generate a more recent sol.h5 with the following SNES example inside your PETSC_DIR: ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 -vec_view hdf5:sol.h5::append ./bin/petsc_gen_xdmf.py sol.h5 the resulting sol.xmf is compatible with Paraview Thanks, Justin On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: > Hi all, > > I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion > using paraview. 
I got the following errors: > > ./petsc_gen_xdmf.py sol.h5 > Traceback (most recent call last): > File "./petsc_gen_xdmf.py", line 236, in > generateXdmf(sys.argv[1]) > File "./petsc_gen_xdmf.py", line 231, in generateXdmf > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, > cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) > File "./petsc_gen_xdmf.py", line 190, in write > for vf in vfields: self.writeField(fp, len(time), t, cellDim, > spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') > File "./petsc_gen_xdmf.py", line 164, in writeField > self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, > domain) > File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents > dims = '1 %d 1' % (numSteps, dof, bs) > TypeError: not all arguments converted during string formatting > > > The hdf5 file is attached. Originally from Matthew. Configuration and make > log files are also attached. > > Fande Kong, > > Thanks, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 12 14:39:35 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 12 Aug 2015 14:39:35 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Sorry there was a typo. Here's the correct run: ./src/snes/examples/tutorials/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 -vec_view hdf5:sol.h5::append On Wed, Aug 12, 2015 at 2:22 PM, Justin Chang wrote: > Fande, > > Your sol.h5 is old and outdated. If you generate a more recent sol.h5 with > the following SNES example inside your PETSC_DIR: > > ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 > -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 > -vec_view hdf5:sol.h5::append > ./bin/petsc_gen_xdmf.py sol.h5 > > the resulting sol.xmf is compatible with Paraview > > Thanks, > Justin > > > On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: > >> Hi all, >> >> I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion >> using paraview. I got the following errors: >> >> ./petsc_gen_xdmf.py sol.h5 >> Traceback (most recent call last): >> File "./petsc_gen_xdmf.py", line 236, in >> generateXdmf(sys.argv[1]) >> File "./petsc_gen_xdmf.py", line 231, in generateXdmf >> Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, >> cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) >> File "./petsc_gen_xdmf.py", line 190, in write >> for vf in vfields: self.writeField(fp, len(time), t, cellDim, >> spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') >> File "./petsc_gen_xdmf.py", line 164, in writeField >> self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, >> domain) >> File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents >> dims = '1 %d 1' % (numSteps, dof, bs) >> TypeError: not all arguments converted during string formatting >> >> >> The hdf5 file is attached. Originally from Matthew. Configuration and >> make log files are also attached. >> >> Fande Kong, >> >> Thanks, >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Aug 12 17:56:59 2015 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 12 Aug 2015 18:56:59 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: > Hi, > > >> Chombo (me) creates an MPIAIJ matrix. 
So automatic Jacobian assembly >> should work. >> >> I have put a SNES in a Chombo code, but did not use automatic Jacobian >> assembly. >> > > Do you have an example? > If you want to make a SNES solver then you need an "apply" call back function and a way to map Chombo vectors with PETSc vectors. Chombo has a level solver (classes derived from PetscSolver) and an AMR composite matrix constructor class (classes derived from PetscCompGrid) in lib/src/AMRElliptic. These two class each create these maps, providing methods to "putChomboInPetsc", and so forth. lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a callback function that you give to PETSc to apply an operator. There are examples in Chombo on how to use/construct these two classes, or two installations of them. Each of these classes has a Poisson and a 2D Viscous Tensor instantiation. You probably want to look at PETSc SNES examples if you are not familiar with SNES to get an idea of what you need to provide. Then, look at the appropriate Chombo class as a place start. I am guessing that you will want to write your own solver and just use these classes to get these mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES is not hard and PetscSolverI.H has examples. These codes only have one user each (and they are both ANAG staff members), so they are pretty immature codes. Mark > Thanks, > Mani > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Aug 12 19:09:07 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 12 Aug 2015 19:09:07 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Thanks, Justin, I can get a correct xml file now. Thanks, Fande Kong, On Wed, Aug 12, 2015 at 2:39 PM, Justin Chang wrote: > Sorry there was a typo. Here's the correct run: > > ./src/snes/examples/tutorials/ex12 -run_type test -refinement_limit 0.0 > -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 > -vec_view hdf5:sol.h5::append > > On Wed, Aug 12, 2015 at 2:22 PM, Justin Chang wrote: > >> Fande, >> >> Your sol.h5 is old and outdated. If you generate a more recent sol.h5 >> with the following SNES example inside your PETSC_DIR: >> >> ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 >> -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 >> -vec_view hdf5:sol.h5::append >> ./bin/petsc_gen_xdmf.py sol.h5 >> >> the resulting sol.xmf is compatible with Paraview >> >> Thanks, >> Justin >> >> >> On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: >> >>> Hi all, >>> >>> I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion >>> using paraview. 
I got the following errors: >>> >>> ./petsc_gen_xdmf.py sol.h5 >>> Traceback (most recent call last): >>> File "./petsc_gen_xdmf.py", line 236, in >>> generateXdmf(sys.argv[1]) >>> File "./petsc_gen_xdmf.py", line 231, in generateXdmf >>> Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, >>> numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, >>> cfields) >>> File "./petsc_gen_xdmf.py", line 190, in write >>> for vf in vfields: self.writeField(fp, len(time), t, cellDim, >>> spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') >>> File "./petsc_gen_xdmf.py", line 164, in writeField >>> self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, >>> domain) >>> File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents >>> dims = '1 %d 1' % (numSteps, dof, bs) >>> TypeError: not all arguments converted during string formatting >>> >>> >>> The hdf5 file is attached. Originally from Matthew. Configuration and >>> make log files are also attached. >>> >>> Fande Kong, >>> >>> Thanks, >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Wed Aug 12 21:52:56 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Wed, 12 Aug 2015 21:52:56 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: Thanks for the information. I'm familiar with SNES+DMDA, just not sure it would work with Chombo. But I'll give it a shot. Cheers, Mani On Wed, Aug 12, 2015 at 5:56 PM, Mark Adams wrote: > > > On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: > >> Hi, >> >> >>> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly >>> should work. >>> >>> I have put a SNES in a Chombo code, but did not use automatic Jacobian >>> assembly. >>> >> >> Do you have an example? >> > > If you want to make a SNES solver then you need an "apply" call back > function and a way to map Chombo vectors with PETSc vectors. > > Chombo has a level solver (classes derived from PetscSolver) and an AMR > composite matrix constructor class (classes derived from PetscCompGrid) in > lib/src/AMRElliptic. These two class each create these maps, providing > methods to "putChomboInPetsc", and so forth. > lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a > callback function that you give to PETSc to apply an operator. There are > examples in Chombo on how to use/construct these two classes, or two > installations of them. Each of these classes has a Poisson and a 2D > Viscous Tensor instantiation. > > You probably want to look at PETSc SNES examples if you are not familiar > with SNES to get an idea of what you need to provide. Then, look at the > appropriate Chombo class as a place start. I am guessing that you will > want to write your own solver and just use these classes to get these > mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES > is not hard and PetscSolverI.H has examples. > > These codes only have one user each (and they are both ANAG staff > members), so they are pretty immature codes. > > Mark > > >> Thanks, >> Mani >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Aug 13 08:36:41 2015 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 13 Aug 2015 09:36:41 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Wed, Aug 12, 2015 at 10:52 PM, Mani Chandra wrote: > Thanks for the information. I'm familiar with SNES+DMDA, > DMDA will not work with Chombo. 
DMDA only works with uniform grids. My two (base) classes do the transformations and linearizations for a 1) level solve and 2) full AMR solve, to a AIJ matrix. I don't think you want to look at DMs. Mark > just not sure it would work with Chombo. But I'll give it a shot. > > Cheers, > Mani > > On Wed, Aug 12, 2015 at 5:56 PM, Mark Adams wrote: > >> >> >> On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: >> >>> Hi, >>> >>> >>>> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly >>>> should work. >>>> >>>> I have put a SNES in a Chombo code, but did not use automatic Jacobian >>>> assembly. >>>> >>> >>> Do you have an example? >>> >> >> If you want to make a SNES solver then you need an "apply" call back >> function and a way to map Chombo vectors with PETSc vectors. >> >> Chombo has a level solver (classes derived from PetscSolver) and an AMR >> composite matrix constructor class (classes derived from PetscCompGrid) in >> lib/src/AMRElliptic. These two class each create these maps, providing >> methods to "putChomboInPetsc", and so forth. >> lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a >> callback function that you give to PETSc to apply an operator. There are >> examples in Chombo on how to use/construct these two classes, or two >> installations of them. Each of these classes has a Poisson and a 2D >> Viscous Tensor instantiation. >> >> You probably want to look at PETSc SNES examples if you are not familiar >> with SNES to get an idea of what you need to provide. Then, look at the >> appropriate Chombo class as a place start. I am guessing that you will >> want to write your own solver and just use these classes to get these >> mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES >> is not hard and PetscSolverI.H has examples. >> >> These codes only have one user each (and they are both ANAG staff >> members), so they are pretty immature codes. >> >> Mark >> >> >>> Thanks, >>> Mani >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Aug 13 10:34:44 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 13 Aug 2015 10:34:44 -0500 Subject: [petsc-users] Understanding the memory bandwidth Message-ID: Hi all, According to our University's HPC cluster (Intel Xeon E5-2680v2 ), the online specifications says I should have a maximum BW of 59.7 GB/s. I am guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. Now, when I run the STREAMS Triad benchmark on a single compute node (two sockets, 10 cores each, 64 GB total memory), on up to 20 processes with MPICH, i get the following: $ mpiexec -n 1 ./MPIVersion: Triad: 13448.6701 Rate (MB/s) $ mpiexec -n 2 ./MPIVersion: Triad: 24409.1406 Rate (MB/s) $ mpiexec -n 4 ./MPIVersion Triad: 31914.8087 Rate (MB/s) $ mpiexec -n 6 ./MPIVersion Triad: 33290.2676 Rate (MB/s) $ mpiexec -n 8 ./MPIVersion Triad: 33618.2542 Rate (MB/s) $ mpiexec -n 10 ./MPIVersion Triad: 33730.1662 Rate (MB/s) $ mpiexec -n 12 ./MPIVersion Triad: 40835.9440 Rate (MB/s) $ mpiexec -n 14 ./MPIVersion Triad: 44396.0042 Rate (MB/s) $ mpiexec -n 16 ./MPIVersion Triad: 54647.5214 Rate (MB/s) * $ mpiexec -n 18 ./MPIVersion Triad: 57530.8125 Rate (MB/s) * $ mpiexec -n 20 ./MPIVersion Triad: 42388.0739 Rate (MB/s) * The * numbers fluctuate greatly each time I run this. 
However, if I use hydra's processor binding options: $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion Triad: 26879.3853 Rate (MB/s) $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion Triad: 48363.8441 Rate (MB/s) $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion Triad: 63479.9284 Rate (MB/s) $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion Triad: 66160.5627 Rate (MB/s) $ mpiexec.hydra -n 16 -bind-to socket ./MPIVersion Triad: 65975.5959 Rate (MB/s) $ mpiexec.hydra -n 20 -bind-to socket ./MPIVersion Triad: 64738.9336 Rate (MB/s) I get similar metrics when i use the binding options "-bind-to hwthread -map-by socket". Now my question is, is 13.5 GB/s on one processor "good"? Because when I compare this to the 59.7 GB/s it seems really inefficient. Is there a way to browse through my system files to confirm this? Also, when I use multiple cores and with proper binding, the streams BW exceeds the reported max BW. Is this expected? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From thronesf at gmail.com Thu Aug 13 10:52:39 2015 From: thronesf at gmail.com (Sharp Stone) Date: Thu, 13 Aug 2015 11:52:39 -0400 Subject: [petsc-users] Multigrid and AMR Message-ID: Hi All, I'm a new who are dealing with linear systems, and want to use multigrid, especially AMR. I found some examples regarding multigrids, but does petsc have any examples of AMR? Thank you in advance! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 13 11:06:00 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 11:06:00 -0500 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: > Hi All, > > I'm a new who are dealing with linear systems, and want to use multigrid, > especially AMR. I found some examples regarding multigrids, but does petsc > have any examples of AMR? > Do you mean Algebraic Multigrid (AMG)? Matt > Thank you in advance! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 13 12:51:15 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Aug 2015 12:51:15 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: Message-ID: <0A4EBD35-AA57-4E25-B874-639DA6199218@mcs.anl.gov> > On Aug 13, 2015, at 10:34 AM, Justin Chang wrote: > > Hi all, > > According to our University's HPC cluster (Intel Xeon E5-2680v2), the online specifications says I should have a maximum BW of 59.7 GB/s. I am guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. 
> > Now, when I run the STREAMS Triad benchmark on a single compute node (two sockets, 10 cores each, 64 GB total memory), on up to 20 processes with MPICH, i get the following: > > $ mpiexec -n 1 ./MPIVersion: > Triad: 13448.6701 Rate (MB/s) > > $ mpiexec -n 2 ./MPIVersion: > Triad: 24409.1406 Rate (MB/s) > > $ mpiexec -n 4 ./MPIVersion > Triad: 31914.8087 Rate (MB/s) > $ mpiexec -n 6 ./MPIVersion > Triad: 33290.2676 Rate (MB/s) > > > $ mpiexec -n 8 ./MPIVersion > Triad: 33618.2542 Rate (MB/s) > > $ mpiexec -n 10 ./MPIVersion > Triad: 33730.1662 Rate (MB/s) > > > $ mpiexec -n 12 ./MPIVersion > Triad: 40835.9440 Rate (MB/s) > > > $ mpiexec -n 14 ./MPIVersion > Triad: 44396.0042 Rate (MB/s) > > $ mpiexec -n 16 ./MPIVersion > Triad: 54647.5214 Rate (MB/s) * > > $ mpiexec -n 18 ./MPIVersion > Triad: 57530.8125 Rate (MB/s) * > > $ mpiexec -n 20 ./MPIVersion > Triad: 42388.0739 Rate (MB/s) * > > The * numbers fluctuate greatly each time I run this. Yeah, MPICH's default behavior is super annoying. I think they need better defaults. > However, if I use hydra's processor binding options: > > $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion > Triad: 26879.3853 Rate (MB/s) > > $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion > Triad: 48363.8441 Rate (MB/s) > > $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion > Triad: 63479.9284 Rate (MB/s) > > $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion > Triad: 66160.5627 Rate (MB/s) > > $ mpiexec.hydra -n 16 -bind-to socket ./MPIVersion > Triad: 65975.5959 Rate (MB/s) > > $ mpiexec.hydra -n 20 -bind-to socket ./MPIVersion > Triad: 64738.9336 Rate (MB/s) > > I get similar metrics when i use the binding options "-bind-to hwthread -map-by socket". > > Now my question is, is 13.5 GB/s on one processor "good"? You mean one core. Yes, that is a good number. These systems are not designed so that a single core can "saturate" (that is use) all the memory bandwidth of the node. Note that after about 8 cores you don't see any more improvement because the 8 cores has saturated the memory bandwidth. What this means is that for PETSc simulations any cores beyond 8 (or so) on the node are just unnecessary eye-candy. > Because when I compare this to the 59.7 GB/s it seems really inefficient. Is there a way to browse through my system files to confirm this? > > Also, when I use multiple cores and with proper binding, the streams BW exceeds the reported max BW. Is this expected? I cannot explain this, look at the exact number of loads and stores needed for the triad benchmark. Perhaps the online docs are out of date. Barry > > Thanks, > Justin > From thronesf at gmail.com Thu Aug 13 12:55:54 2015 From: thronesf at gmail.com (Sharp Stone) Date: Thu, 13 Aug 2015 13:55:54 -0400 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: No, I mean Adaptive Mesh Refinement (AMR). Is this supported now in Petsc? Thanks! On Thu, Aug 13, 2015 at 12:06 PM, Matthew Knepley wrote: > On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: > >> Hi All, >> >> I'm a new who are dealing with linear systems, and want to use multigrid, >> especially AMR. I found some examples regarding multigrids, but does petsc >> have any examples of AMR? >> > > Do you mean Algebraic Multigrid (AMG)? > > Matt > > >> Thank you in advance! >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -- Best regards, Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 13 12:57:38 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 12:57:38 -0500 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 12:55 PM, Sharp Stone wrote: > No, I mean Adaptive Mesh Refinement (AMR). Is this supported now in Petsc? > No, not at this time. Thanks, Matt > Thanks! > > On Thu, Aug 13, 2015 at 12:06 PM, Matthew Knepley > wrote: > >> On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: >> >>> Hi All, >>> >>> I'm a new who are dealing with linear systems, and want to use >>> multigrid, especially AMR. I found some examples regarding multigrids, but >>> does petsc have any examples of AMR? >>> >> >> Do you mean Algebraic Multigrid (AMG)? >> >> Matt >> >> >>> Thank you in advance! >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > Best regards, > > Feng > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Aug 13 13:04:27 2015 From: jed at jedbrown.org (Jed Brown) Date: Thu, 13 Aug 2015 12:04:27 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: Message-ID: <877fozqd3o.fsf@jedbrown.org> Justin Chang writes: > Hi all, > > According to our University's HPC cluster (Intel Xeon E5-2680v2 > ), the > online specifications says I should have a maximum BW of 59.7 GB/s. I am > guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. Yup, per socket. > Now, when I run the STREAMS Triad benchmark on a single compute node (two > sockets, 10 cores each, 64 GB total memory), on up to 20 processes with > MPICH, i get the following: > > $ mpiexec -n 1 ./MPIVersion: > Triad: 13448.6701 Rate (MB/s) > > $ mpiexec -n 2 ./MPIVersion: > Triad: 24409.1406 Rate (MB/s) > > $ mpiexec -n 4 ./MPIVersion > Triad: 31914.8087 Rate (MB/s) > > $ mpiexec -n 6 ./MPIVersion > Triad: 33290.2676 Rate (MB/s) > > $ mpiexec -n 8 ./MPIVersion > Triad: 33618.2542 Rate (MB/s) > > $ mpiexec -n 10 ./MPIVersion > Triad: 33730.1662 Rate (MB/s) > > $ mpiexec -n 12 ./MPIVersion > Triad: 40835.9440 Rate (MB/s) > > $ mpiexec -n 14 ./MPIVersion > Triad: 44396.0042 Rate (MB/s) > > $ mpiexec -n 16 ./MPIVersion > Triad: 54647.5214 Rate (MB/s) * > > $ mpiexec -n 18 ./MPIVersion > Triad: 57530.8125 Rate (MB/s) * > > $ mpiexec -n 20 ./MPIVersion > Triad: 42388.0739 Rate (MB/s) * > > The * numbers fluctuate greatly each time I run this. However, if I use > hydra's processor binding options: > > $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion > Triad: 26879.3853 Rate (MB/s) > > $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion > Triad: 48363.8441 Rate (MB/s) > > $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion > Triad: 63479.9284 Rate (MB/s) It looks like with one core/socket, all your memory sits over one channel. You can play tricks to avoid that or use 4 cores/socket in order to use all memory channels. > $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion > Triad: 66160.5627 Rate (MB/s) So this is a pretty low fraction (55%) of 59.7*2 = 119.4. 
I suspect your memory or motherboard is at most 1600 MHz, so your peak would be 102.4 GB/s. You can check this as root using "dmidecode --type 17", which should give one entry per channel, looking something like this: Handle 0x002B, DMI type 17, 34 bytes Memory Device Array Handle: 0x002A Error Information Handle: 0x002F Total Width: Unknown Data Width: Unknown Size: 4096 MB Form Factor: DIMM Set: None Locator: DIMM0 Bank Locator: BANK 0 Type: Type Detail: None Speed: Unknown Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Unknown Part Number: Not Specified Rank: Unknown Configured Clock Speed: 1600 MHz > Now my question is, is 13.5 GB/s on one processor "good"? One memory channel is 1.866 * 8 = 14.9 GB/s. You can get some bonus overlap when adjacent pages are on different busses, but the prefetcher only looks so far ahead, so most of the time you're only pulling from one channel when using one thread. > Because when I compare this to the 59.7 GB/s it seems really > inefficient. Is there a way to browse through my system files to > confirm this? > > Also, when I use multiple cores and with proper binding, the streams BW > exceeds the reported max BW. Is this expected? You're using two sockets. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jychang48 at gmail.com Thu Aug 13 15:22:42 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 13 Aug 2015 15:22:42 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: <877fozqd3o.fsf@jedbrown.org> References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > It looks like with one core/socket, all your memory sits over one > channel. You can play tricks to avoid that or use 4 cores/socket in > order to use all memory channels. How do I play these tricks? > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > your memory or motherboard is at most 1600 MHz, so your peak would be > 102.4 GB/s. > You can check this as root using "dmidecode --type 17", which should > give one entry per channel, looking something like this: > > Handle 0x002B, DMI type 17, 34 bytes > Memory Device > Array Handle: 0x002A > Error Information Handle: 0x002F > Total Width: Unknown > Data Width: Unknown > Size: 4096 MB > Form Factor: DIMM > Set: None > Locator: DIMM0 > Bank Locator: BANK 0 > Type: > Type Detail: None > Speed: Unknown > Manufacturer: Not Specified > Serial Number: Not Specified > Asset Tag: Unknown > Part Number: Not Specified > Rank: Unknown > Configured Clock Speed: 1600 MHz I have no root access. Is there another way to confirm the clock speed? --- So if I have two sockets per node, then the theoretical peak bandwidth is actually double than what I thought (whether it be 119.4 GB/s or 102.4 GB/s). And if 8 cores really is the optimal number to use for a single compute node, why are there 20 totals to begin with? Or would this depend on the particular application? Also, can someone elaborate on the difference between the words "core", "processor", and "thread"? 
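For reference, the arithmetic behind the numbers in this thread, assuming 8-byte transfers, four memory channels per socket, and DIMMs that actually run at the quoted speed:

    one channel at 1866 MT/s:   1.866e9 * 8 bytes = 14.9 GB/s
    one socket (4 channels):    4 * 14.9          = 59.7 GB/s
    two sockets at 1866 MT/s:   2 * 59.7          = 119.4 GB/s
    two sockets at 1600 MT/s:   2 * 4 * 1.6e9 * 8 = 102.4 GB/s

So the ~13.4 GB/s single-core Triad number is close to one channel, and the ~66 GB/s best two-socket number is roughly 55% of the 119.4 GB/s nominal peak (or about 65% of 102.4 GB/s if the memory is actually clocked at 1600). On the load/store counting Barry mentions: Triad is a[i] = b[i] + q*c[i], which STREAM counts as 24 bytes per index (two loads plus one store); if the store also triggers a write-allocate read, the hardware really moves 32 bytes per index, so measured rates and true memory traffic can differ by about a third.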
From knepley at gmail.com Thu Aug 13 15:30:47 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 15:30:47 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, Aug 13, 2015 at 3:22 PM, Justin Chang wrote: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > > It looks like with one core/socket, all your memory sits over one > > channel. You can play tricks to avoid that or use 4 cores/socket in > > order to use all memory channels. > > How do I play these tricks? > > > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > > your memory or motherboard is at most 1600 MHz, so your peak would be > > 102.4 GB/s. > > > You can check this as root using "dmidecode --type 17", which should > > give one entry per channel, looking something like this: > > > > Handle 0x002B, DMI type 17, 34 bytes > > Memory Device > > Array Handle: 0x002A > > Error Information Handle: 0x002F > > Total Width: Unknown > > Data Width: Unknown > > Size: 4096 MB > > Form Factor: DIMM > > Set: None > > Locator: DIMM0 > > Bank Locator: BANK 0 > > Type: > > Type Detail: None > > Speed: Unknown > > Manufacturer: Not Specified > > Serial Number: Not Specified > > Asset Tag: Unknown > > Part Number: Not Specified > > Rank: Unknown > > Configured Clock Speed: 1600 MHz > > I have no root access. Is there another way to confirm the clock speed? > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? > Kind Answer: Different application have different needs Cynical Answer: Computer companies sell you what they can produce, lots of cores, not what you need, lots of bandwidth. Bandwidth is very expensive and there are technical limits. > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? > A core and a processor are hardware terms. I think they are both fuzzy, but I understand a core to be something that can carry a thread of execution, namely a program counter, instruction and data stream, and compute something. A thread is a logical construct for talking about an execution stream. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 13 15:40:55 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 13 Aug 2015 15:40:55 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, 13 Aug 2015, Matthew Knepley wrote: > > > Also, can someone elaborate on the difference between the words > > "core", "processor", and "thread"? > > > A core and a processor are hardware terms. I think they are both > fuzzy, but I understand a core to be something that can carry a > thread of execution, namely a program counter, instruction and data > stream, and compute something. A thread is a logical construct for > talking about an execution stream. 
Perhaps there are multiple terminologies here - but I think you are asking about the difference between: CPU/processor, core, hardware-thread CPU: a (manufacturing) packaging unit. Or a single chip that can be inserted on the MotherBoard. Core: a CPU can have multiple cores. Each core is equivalent to independent processing unit hardware-thread. Its a virtual mode for a single core (hardware) process multiple streams of instructions simultaneously (aka virtual cores). The core vs hardware threads is a murky territory. H designers can do quiet complex things here - esp between gore/hardware-thread boundaries. Satish From bsmith at mcs.anl.gov Thu Aug 13 15:47:30 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Aug 2015 15:47:30 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: <318DDA3E-0B57-432C-A30D-94257EBBF686@mcs.anl.gov> > On Aug 13, 2015, at 3:30 PM, Matthew Knepley wrote: > > On Thu, Aug 13, 2015 at 3:22 PM, Justin Chang wrote: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > > It looks like with one core/socket, all your memory sits over one > > channel. You can play tricks to avoid that or use 4 cores/socket in > > order to use all memory channels. > > How do I play these tricks? > > > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > > your memory or motherboard is at most 1600 MHz, so your peak would be > > 102.4 GB/s. > > > You can check this as root using "dmidecode --type 17", which should > > give one entry per channel, looking something like this: > > > > Handle 0x002B, DMI type 17, 34 bytes > > Memory Device > > Array Handle: 0x002A > > Error Information Handle: 0x002F > > Total Width: Unknown > > Data Width: Unknown > > Size: 4096 MB > > Form Factor: DIMM > > Set: None > > Locator: DIMM0 > > Bank Locator: BANK 0 > > Type: > > Type Detail: None > > Speed: Unknown > > Manufacturer: Not Specified > > Serial Number: Not Specified > > Asset Tag: Unknown > > Part Number: Not Specified > > Rank: Unknown > > Configured Clock Speed: 1600 MHz > > I have no root access. Is there another way to confirm the clock speed? > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? > > Kind Answer: Different application have different needs > > Cynical Answer: Computer companies sell you what they can produce, > lots of cores, not what you need, lots of bandwidth. Bandwidth is very > expensive and there are technical limits. Cost of production of a system may not, is not, simply linearly proportional to the number of cores, or number of floating point units or any other particular feature of a system. For example, maybe a 50 core system costs $50,000 and a 100 core system (everything else being equal) costs $70,000 for a company to make, in a sense each additional core (within reason) costs less so it is acceptable to get less performance out it since the incremental cost is lower. Barry > > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? > > A core and a processor are hardware terms. 
I think they are both fuzzy, > but I understand a core to be something that can carry a thread of execution, > namely a program counter, instruction and data stream, and compute something. > A thread is a logical construct for talking about an execution stream. > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jed at jedbrown.org Thu Aug 13 15:50:34 2015 From: jed at jedbrown.org (Jed Brown) Date: Thu, 13 Aug 2015 14:50:34 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: <87r3n6oqud.fsf@jedbrown.org> Justin Chang writes: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: >> It looks like with one core/socket, all your memory sits over one >> channel. You can play tricks to avoid that or use 4 cores/socket in >> order to use all memory channels. > > How do I play these tricks? They generally aren't practical outside of simple benchmarks. Read through this blog series if you want to dive into memory performance. http://sites.utexas.edu/jdm4372/2010/11/11/optimizing-amd-opteron-memory-bandwidth-part-5-single-thread-read-only/ > I have no root access. Is there another way to confirm the clock speed? I don't recall a way to access that information without root. You can benchmark, obviously, but you're looking for an independent information source. You can ask a sysadmin to run this on a compute node. > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? "20 totals"? Note that you might have hyperthreading, in which case there are twice as many logical cores as physical cores. > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? Processor - typically a unit of manufacturing and sale that goes into a socket. Sometimes it shares a last-level cache and other times it is independent parts stuck together. Sometimes different parts of the processor are connected to different memory channels (implying multiple "NUMA nodes" on a single socket) and sometimes they are multiplexed (so all cores see the same speed to any memory channel on that socket). Core - the physical unit that processes ("integer") instructions. There can be multiple floating point units per core (e.g., anything with dual-issue FMA) or multiple cores per floating point unit (e.g., the AMD processors on Titan). Logical core/hardware thread - the logical unit exposed to the operating system. Often there are 2, 4, or more hardware threads per core. These have their own registers (as far as you can tell; it can be complicated by "register renaming") and are used to cover high-latency operations including waiting on memory and some arithmetic. Usually only one hardware thread issues instructions in any given cycle, so if a single thread has sufficient ILP (instruction-level parallelism) to keep issuing every cycle, there can be no benefit to using multiple hardware threads. This is impossible with some architectures, thus necessitating use of multiple hardware threads per core to reach peak flops, integer instructions, and/or bandwidth. 
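One way to see how these logical CPUs are handed out to a job, and what an option like "-bind-to socket" actually changed, is to have every rank print the set of logical CPUs it is allowed to run on. A small Linux-only sketch (the program name below is made up; sched_getaffinity reports the OS's logical-CPU numbering, i.e. hardware threads):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int       rank, len, cpu;
      char      name[MPI_MAX_PROCESSOR_NAME], list[1024] = "";
      cpu_set_t mask;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(name, &len);

      if (!sched_getaffinity(0, sizeof(mask), &mask)) {   /* 0 = this process */
        for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
          if (CPU_ISSET(cpu, &mask)) {
            char buf[16];
            snprintf(buf, sizeof(buf), " %d", cpu);
            strncat(list, buf, sizeof(list) - strlen(list) - 1);
          }
        }
      }
      printf("rank %d on %s may run on logical CPUs:%s\n", rank, name, list);

      MPI_Finalize();
      return 0;
    }

Running it as, say, "mpiexec.hydra -n 20 -bind-to socket ./report_affinity" and again without the binding option shows whether ranks are pinned to one socket's hardware threads or free to migrate, which is usually the explanation for the fluctuating unbound numbers earlier in the thread.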
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From asmund.ervik at ntnu.no Fri Aug 14 03:29:22 2015 From: asmund.ervik at ntnu.no (=?UTF-8?Q?=c3=85smund_Ervik?=) Date: Fri, 14 Aug 2015 10:29:22 +0200 Subject: [petsc-users] Understanding the memory bandwidth Message-ID: <55CDA6E2.5000302@ntnu.no> >> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >> your memory or motherboard is at most 1600 MHz, so your peak would be >> 102.4 GB/s. > >> You can check this as root using "dmidecode --type 17", which should >> give one entry per channel, looking something like this: >> >> Handle 0x002B, DMI type 17, 34 bytes >> Memory Device >> Array Handle: 0x002A >> Error Information Handle: 0x002F >> Total Width: Unknown >> Data Width: Unknown >> Size: 4096 MB >> Form Factor: DIMM >> Set: None >> Locator: DIMM0 >> Bank Locator: BANK 0 >> Type: >> Type Detail: None >> Speed: Unknown >> Manufacturer: Not Specified >> Serial Number: Not Specified >> Asset Tag: Unknown >> Part Number: Not Specified >> Rank: Unknown >> Configured Clock Speed: 1600 MHz > >I have no root access. Is there another way to confirm the clock speed? Also note: even in the case where your motherboard, RAM and CPU all say 1866 on the label, if there are more memory DIMMs (chips) per node than channels, say 16 DIMMs on your 8 channels, you will see a performance reduction on the order of 20-30%. This is more likely if you are using nodes in a "high-memory queue" or similar where there's >= 128 GB memory per node. (This will change in the future when/if people start using DDR4 LRDIMMs.) There's a series of in-depth discussions here: http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also lots of interesting memory-stuff on John McCalpin's blog: https://sites.utexas.edu/jdm4372/ Regards, ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From R.Thomas at tudelft.nl Fri Aug 14 09:23:16 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Fri, 14 Aug 2015 14:23:16 +0000 Subject: [petsc-users] petsc KLU Message-ID: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Dear PETSc users, I would like to know if I can replace the following functions MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) by MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) Thank you, Best regards, Romain -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Aug 14 09:40:38 2015 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Aug 2015 09:40:38 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Message-ID: On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I installed > the package SuiteSparse) > Why would you do that? It already works with the former code. In fact, you should really just use KSPCreate(comm, &ksp) KSPSetOperator(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); and then give the options -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse This is no advantage to using the Factor language since subsequent calls to KSPSolve() will not refactor. Matt > Thank you, > Best regards, > Romain > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From R.Thomas at tudelft.nl Fri Aug 14 10:07:39 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Fri, 14 Aug 2015 15:07:39 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Message-ID: <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Hi, Thank you for your answer. My problem is a bit more complex. During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). Best regards, Romain From: Matthew Knepley [mailto:knepley at gmail.com] Sent: vrijdag 14 augustus 2015 16:41 To: Romain Thomas Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas > wrote: Dear PETSc users, I would like to know if I can replace the following functions MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) by MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) Why would you do that? It already works with the former code. 
In fact, you should really just use KSPCreate(comm, &ksp) KSPSetOperator(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); and then give the options -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse This is no advantage to using the Factor language since subsequent calls to KSPSolve() will not refactor. Matt Thank you, Best regards, Romain -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 14 10:30:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 14 Aug 2015 10:30:58 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Message-ID: You should call MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); then call > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) This routines correctly internally call the appropriate MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. There is no reason to (and it won't work) to call > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) directly. Barry > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > Hi, > Thank you for your answer. > My problem is a bit more complex. During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). > Best regards, > Romain > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: vrijdag 14 augustus 2015 16:41 > To: Romain Thomas > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) > > Why would you do that? It already works with the former code. In fact, you should really just use > > KSPCreate(comm, &ksp) > KSPSetOperator(ksp, A, A); > KSPSetFromOptions(ksp); > KSPSolve(ksp, b, x); > > and then give the options > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > This is no advantage to using the Factor language since subsequent calls to > KSPSolve() will not refactor. 
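For reference, a compact sketch of the calling sequence Barry lays out above, written against the 3.5-era API: obtain the factor matrix with MatGetFactor() and MATSOLVERKLU, then run the generic symbolic and numeric factorization and solve. The function name, the use of the natural ordering, and the assumption that A, b and x already exist as sequential AIJ objects are illustrative, not taken from the original poster's code. When only the numerical values of A change between time steps, the MatGetFactor() and MatLUFactorSymbolic() calls can be done once and only MatLUFactorNumeric() plus MatSolve() repeated.

#include <petscmat.h>

PetscErrorCode SolveWithKLU(Mat A, Vec b, Vec x)
{
  Mat            F;                 /* holds the KLU factors */
  IS             rperm, cperm;      /* row/column orderings */
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERKLU, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &rperm, &cperm);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, rperm, cperm, &info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);   /* repeat this step when A's values change */
  ierr = MatSolve(F, b, x);CHKERRQ(ierr);
  ierr = ISDestroy(&rperm);CHKERRQ(ierr);
  ierr = ISDestroy(&cperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}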
> > Matt > > Thank you, > Best regards, > Romain > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jed at jedbrown.org Fri Aug 14 11:10:02 2015 From: jed at jedbrown.org (Jed Brown) Date: Fri, 14 Aug 2015 10:10:02 -0600 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: <87y4hgqzle.fsf@jedbrown.org> Message-ID: <878u9dn95x.fsf@jedbrown.org> Please always use "reply-all" so that your messages go to the list. This is standard mailing list etiquette. It is important to preserve threading for people who find this discussion later and so that we do not waste our time re-answering the same questions that have already been answered in private side-conversations. You'll likely get an answer faster that way too. sheng liu writes: > Thank you! I have a small question: What does the "degree of freedom" mean > in the DMDA object? If I have a spin-1/2 system, and I discrete the system, > does that mean I have DOF=2? DOF is the number of PetscScalar values at each grid point. > 2015-08-12 23:46 GMT+08:00 Jed Brown : > >> sheng liu writes: >> >> > Thank you very much! I have another question. If I need all the >> eigenvalues >> > of the sparse matrix, which solver should I use? Thanks! >> >> That's O(n^3) with n=1e6. Better find a way to not need all the >> eigenvalues or to make the system smaller. >> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From R.Thomas at tudelft.nl Mon Aug 17 09:34:37 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Mon, 17 Aug 2015 14:34:37 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Message-ID: <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Hi Thank you for your answer. I was asking help because I find LU factorization 2-3 times faster than KLU. According to my problem size (200*200) and type (power system simulation), I should get almost the same computation time. Is it true to think that? Is the difference of time due to the interface between PETSc and SuiteSparse? Thank you, Romain -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: vrijdag 14 augustus 2015 17:31 To: Romain Thomas Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU You should call MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); then call > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) This routines correctly internally call the appropriate MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. There is no reason to (and it won't work) to call > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) directly. Barry > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > Hi, > Thank you for your answer. > My problem is a bit more complex. 
During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). > Best regards, > Romain > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: vrijdag 14 augustus 2015 16:41 > To: Romain Thomas > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I > installed the package SuiteSparse) > > Why would you do that? It already works with the former code. In fact, > you should really just use > > KSPCreate(comm, &ksp) > KSPSetOperator(ksp, A, A); > KSPSetFromOptions(ksp); > KSPSolve(ksp, b, x); > > and then give the options > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > This is no advantage to using the Factor language since subsequent > calls to > KSPSolve() will not refactor. > > Matt > > Thank you, > Best regards, > Romain > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hzhang at mcs.anl.gov Mon Aug 17 10:08:17 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 17 Aug 2015 10:08:17 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Message-ID: Romain: Do you mean small sparse sequential 200 by 200 matrices? Petsc LU might be better than external LU packages because it implements simple LU algorithm and we took good care on data accesing (I've heard same observations). You may try 'qmd' matrix ordering for power grid simulation. I do not have experience on SuiteSparse. Testing MUMPS is worth it as well. Hong Hi > Thank you for your answer. I was asking help because I find LU > factorization 2-3 times faster than KLU. According to my problem size > (200*200) and type (power system simulation), I should get almost the same > computation time. Is it true to think that? Is the difference of time due > to the interface between PETSc and SuiteSparse? 
> Thank you, > Romain > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: vrijdag 14 augustus 2015 17:31 > To: Romain Thomas > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate > MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. > There is no reason to (and it won't work) to call > > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > directly. > > Barry > > > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > > > Hi, > > Thank you for your answer. > > My problem is a bit more complex. During the simulation (?real time?), I > need to upgrade at each time step the matrix A and the MatassemblyBegin and > MatassemblyEnd take time and so, in order to avoid these functions, I don?t > use ksp or pc. I prefer to use the functions MatLUFactorNumeric, > MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is > similar functions for KLU. (I tried for Cholesky and, iLU and it works > well). > > Best regards, > > Romain > > > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > > Sent: vrijdag 14 augustus 2015 16:41 > > To: Romain Thomas > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] petsc KLU > > > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas > wrote: > > Dear PETSc users, > > > > I would like to know if I can replace the following functions > > > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > > > by > > > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > > > in my code for the simulation of electrical power systems? (I > > installed the package SuiteSparse) > > > > Why would you do that? It already works with the former code. In fact, > > you should really just use > > > > KSPCreate(comm, &ksp) > > KSPSetOperator(ksp, A, A); > > KSPSetFromOptions(ksp); > > KSPSolve(ksp, b, x); > > > > and then give the options > > > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > > > This is no advantage to using the Factor language since subsequent > > calls to > > KSPSolve() will not refactor. > > > > Matt > > > > Thank you, > > Best regards, > > Romain > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From xzhao99 at gmail.com Mon Aug 17 10:49:08 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 10:49:08 -0500 Subject: [petsc-users] Petsc creates a random vector Message-ID: Hi all, I want PETSc to generate random vector using VecSetRandom() following given examples, but failed and showed some "out of memory" error. The following is the code, which goes well until it reaches VecSetRandom(). Can anyone help me figure out the reason? Thanks a lot. XZ -------------------------------------------------------------------------------------------- Vec u; PetscRandom rand_ctx; /* random number generator context */ PetscMPIInt size, rank; PetscInt n, dn; MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); n = N/size + 1; dn = n*size - N; if ( dn>0 && ranktest in petsc_random_vector(): rank = %d, n = %d\n",rank,n); VecCreate(PETSC_COMM_WORLD,&u); VecSetSizes(u,n,N); PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); #if defined(PETSC_HAVE_DRAND48) PetscRandomSetType(rand_ctx,PETSCRAND48); #elif defined(PETSC_HAVE_RAND) PetscRandomSetType(rand_ctx,PETSCRAND); #endif PetscRandomSetFromOptions(rand_ctx); VecSetRandom(u,rand_ctx); PetscRandomDestroy(&rand_ctx); -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 10:57:07 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 10:57:07 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: > Hi all, > > I want PETSc to generate random vector using VecSetRandom() following > given examples, but failed and showed some "out of memory" error. The > following is the code, which goes well until it reaches VecSetRandom(). Can > anyone help me figure out the reason? Thanks a lot. > Does src/vec/vec/examples/tests/ex43.c run for you? Thanks, Matt > XZ > > > > -------------------------------------------------------------------------------------------- > Vec u; > PetscRandom rand_ctx; /* random number generator context */ > PetscMPIInt size, rank; > PetscInt n, dn; > > > MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); > MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); > n = N/size + 1; > dn = n*size - N; > if ( dn>0 && rank printf("--->test in petsc_random_vector(): rank = %d, n = %d\n",rank,n); > > > VecCreate(PETSC_COMM_WORLD,&u); > VecSetSizes(u,n,N); > PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); > #if defined(PETSC_HAVE_DRAND48) > PetscRandomSetType(rand_ctx,PETSCRAND48); > #elif defined(PETSC_HAVE_RAND) > PetscRandomSetType(rand_ctx,PETSCRAND); > #endif > PetscRandomSetFromOptions(rand_ctx); > > > VecSetRandom(u,rand_ctx); > PetscRandomDestroy(&rand_ctx); > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Aug 17 11:02:24 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:02:24 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: No. 
It gives the following error msg: mpirun -np 2 ex43 [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process (utils/launch/launch.c:75): execvp error on file ex43 (No such file or directory) execvp error on file ex43 (No such file or directory) On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley wrote: > On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: > >> Hi all, >> >> I want PETSc to generate random vector using VecSetRandom() following >> given examples, but failed and showed some "out of memory" error. The >> following is the code, which goes well until it reaches VecSetRandom(). Can >> anyone help me figure out the reason? Thanks a lot. >> > > Does src/vec/vec/examples/tests/ex43.c run for you? > > Thanks, > > Matt > > >> XZ >> >> >> >> -------------------------------------------------------------------------------------------- >> Vec u; >> PetscRandom rand_ctx; /* random number generator context */ >> PetscMPIInt size, rank; >> PetscInt n, dn; >> >> >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >> n = N/size + 1; >> dn = n*size - N; >> if ( dn>0 && rank> printf("--->test in petsc_random_vector(): rank = %d, n = %d\n",rank,n); >> >> >> VecCreate(PETSC_COMM_WORLD,&u); >> VecSetSizes(u,n,N); >> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >> #if defined(PETSC_HAVE_DRAND48) >> PetscRandomSetType(rand_ctx,PETSCRAND48); >> #elif defined(PETSC_HAVE_RAND) >> PetscRandomSetType(rand_ctx,PETSCRAND); >> #endif >> PetscRandomSetFromOptions(rand_ctx); >> >> >> VecSetRandom(u,rand_ctx); >> PetscRandomDestroy(&rand_ctx); >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Mon Aug 17 11:12:47 2015 From: karpeev at mcs.anl.gov (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:12:47 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: You need to give the path to the executable, for example, ./ex43 etc. Dmitry. On Mon, Aug 17, 2015 at 11:02 AM Xujun Zhao wrote: > No. It gives the following error msg: > > mpirun -np 2 ex43 > > [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] > HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process > (utils/launch/launch.c:75): execvp error on file ex43 (No such file or > directory) > > execvp error on file ex43 (No such file or directory) > > On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I want PETSc to generate random vector using VecSetRandom() following >>> given examples, but failed and showed some "out of memory" error. The >>> following is the code, which goes well until it reaches VecSetRandom(). Can >>> anyone help me figure out the reason? Thanks a lot. >>> >> >> Does src/vec/vec/examples/tests/ex43.c run for you? 
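Stepping back to the VecSetRandom() crash in the original post: as pasted, the fragment never sets a vector type (there is no VecSetType() or VecSetFromOptions() call before VecSetRandom()), and the local-size arithmetic around n and dn appears to have been garbled by the mailer. A minimal self-contained sketch that avoids both issues by letting PETSc pick the layout and the type is below; the global size N = 100 is only illustrative.

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            u;
  PetscRandom    r;
  PetscInt       N = 100;            /* illustrative global size */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreate(PETSC_COMM_WORLD, &u);CHKERRQ(ierr);
  ierr = VecSetSizes(u, PETSC_DECIDE, N);CHKERRQ(ierr);   /* let PETSc split N across the ranks */
  ierr = VecSetFromOptions(u);CHKERRQ(ierr);              /* sets the type (VECMPI/VECSEQ) */
  ierr = PetscRandomCreate(PETSC_COMM_WORLD, &r);CHKERRQ(ierr);
  ierr = PetscRandomSetFromOptions(r);CHKERRQ(ierr);
  ierr = VecSetRandom(u, r);CHKERRQ(ierr);
  ierr = VecView(u, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = PetscRandomDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}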
>> >> Thanks, >> >> Matt >> >> >>> XZ >>> >>> >>> >>> -------------------------------------------------------------------------------------------- >>> Vec u; >>> PetscRandom rand_ctx; /* random number generator context */ >>> PetscMPIInt size, rank; >>> PetscInt n, dn; >>> >>> >>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>> n = N/size + 1; >>> dn = n*size - N; >>> if ( dn>0 && rank>> printf("--->test in petsc_random_vector(): rank = %d, n = >>> %d\n",rank,n); >>> >>> >>> VecCreate(PETSC_COMM_WORLD,&u); >>> VecSetSizes(u,n,N); >>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>> #if defined(PETSC_HAVE_DRAND48) >>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>> #elif defined(PETSC_HAVE_RAND) >>> PetscRandomSetType(rand_ctx,PETSCRAND); >>> #endif >>> PetscRandomSetFromOptions(rand_ctx); >>> >>> >>> VecSetRandom(u,rand_ctx); >>> PetscRandomDestroy(&rand_ctx); >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 11:13:23 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 11:13:23 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: > No. It gives the following error msg: > Did you build the executable? cd src/vec/vec/examples/tutorials make ex43 Matt > mpirun -np 2 ex43 > > [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] > HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process > (utils/launch/launch.c:75): execvp error on file ex43 (No such file or > directory) > > execvp error on file ex43 (No such file or directory) > > On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I want PETSc to generate random vector using VecSetRandom() following >>> given examples, but failed and showed some "out of memory" error. The >>> following is the code, which goes well until it reaches VecSetRandom(). Can >>> anyone help me figure out the reason? Thanks a lot. >>> >> >> Does src/vec/vec/examples/tests/ex43.c run for you? 
>> >> Thanks, >> >> Matt >> >> >>> XZ >>> >>> >>> >>> -------------------------------------------------------------------------------------------- >>> Vec u; >>> PetscRandom rand_ctx; /* random number generator context */ >>> PetscMPIInt size, rank; >>> PetscInt n, dn; >>> >>> >>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>> n = N/size + 1; >>> dn = n*size - N; >>> if ( dn>0 && rank>> printf("--->test in petsc_random_vector(): rank = %d, n = >>> %d\n",rank,n); >>> >>> >>> VecCreate(PETSC_COMM_WORLD,&u); >>> VecSetSizes(u,n,N); >>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>> #if defined(PETSC_HAVE_DRAND48) >>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>> #elif defined(PETSC_HAVE_RAND) >>> PetscRandomSetType(rand_ctx,PETSCRAND); >>> #endif >>> PetscRandomSetFromOptions(rand_ctx); >>> >>> >>> VecSetRandom(u,rand_ctx); >>> PetscRandomDestroy(&rand_ctx); >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Aug 17 11:15:07 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:15:07 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Ahhhh, I should drink some coffee in the morning. Now it passed the test! On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley wrote: > On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: > >> No. It gives the following error msg: >> > > Did you build the executable? > > cd src/vec/vec/examples/tutorials > make ex43 > > Matt > > >> mpirun -np 2 ex43 >> >> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >> directory) >> >> execvp error on file ex43 (No such file or directory) >> >> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >> wrote: >> >>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >>> >>>> Hi all, >>>> >>>> I want PETSc to generate random vector using VecSetRandom() following >>>> given examples, but failed and showed some "out of memory" error. The >>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>> anyone help me figure out the reason? Thanks a lot. >>>> >>> >>> Does src/vec/vec/examples/tests/ex43.c run for you? 
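Condensing the fix from this exchange into one place: the example has to be built in its source directory first, and then launched with an explicit path so the shell does not go looking for an installed ex43 on $PATH. Roughly, with PETSC_DIR pointing at the PETSc source tree and ex43.c sitting under src/vec/vec/examples/ (in the tests or tutorials subdirectory referenced above):

cd $PETSC_DIR/src/vec/vec/examples/tutorials
make ex43
mpirun -np 2 ./ex43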
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> XZ >>>> >>>> >>>> >>>> -------------------------------------------------------------------------------------------- >>>> Vec u; >>>> PetscRandom rand_ctx; /* random number generator context */ >>>> PetscMPIInt size, rank; >>>> PetscInt n, dn; >>>> >>>> >>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>> n = N/size + 1; >>>> dn = n*size - N; >>>> if ( dn>0 && rank>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>> %d\n",rank,n); >>>> >>>> >>>> VecCreate(PETSC_COMM_WORLD,&u); >>>> VecSetSizes(u,n,N); >>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>> #if defined(PETSC_HAVE_DRAND48) >>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>> #elif defined(PETSC_HAVE_RAND) >>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>> #endif >>>> PetscRandomSetFromOptions(rand_ctx); >>>> >>>> >>>> VecSetRandom(u,rand_ctx); >>>> PetscRandomDestroy(&rand_ctx); >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkarpeev at gmail.com Mon Aug 17 11:16:30 2015 From: dkarpeev at gmail.com (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:16:30 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Xujun, Regarding your original question: please send the complete error message. Dmitry. On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: > Ahhhh, I should drink some coffee in the morning. > Now it passed the test! > > On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >> >>> No. It gives the following error msg: >>> >> >> Did you build the executable? >> >> cd src/vec/vec/examples/tutorials >> make ex43 >> >> Matt >> >> >>> mpirun -np 2 ex43 >>> >>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>> directory) >>> >>> execvp error on file ex43 (No such file or directory) >>> >>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>> wrote: >>> >>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >>>> >>>>> Hi all, >>>>> >>>>> I want PETSc to generate random vector using VecSetRandom() following >>>>> given examples, but failed and showed some "out of memory" error. The >>>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>>> anyone help me figure out the reason? Thanks a lot. >>>>> >>>> >>>> Does src/vec/vec/examples/tests/ex43.c run for you? 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> XZ >>>>> >>>>> >>>>> >>>>> -------------------------------------------------------------------------------------------- >>>>> Vec u; >>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>> PetscMPIInt size, rank; >>>>> PetscInt n, dn; >>>>> >>>>> >>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>> n = N/size + 1; >>>>> dn = n*size - N; >>>>> if ( dn>0 && rank>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>> %d\n",rank,n); >>>>> >>>>> >>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>> VecSetSizes(u,n,N); >>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>> #if defined(PETSC_HAVE_DRAND48) >>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>> #elif defined(PETSC_HAVE_RAND) >>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>> #endif >>>>> PetscRandomSetFromOptions(rand_ctx); >>>>> >>>>> >>>>> VecSetRandom(u,rand_ctx); >>>>> PetscRandomDestroy(&rand_ctx); >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at anl.gov Mon Aug 17 11:21:15 2015 From: abhyshr at anl.gov (Abhyankar, Shrirang G.) Date: Mon, 17 Aug 2015 16:21:15 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Message-ID: Romain, I added the KLU interface to PETSc last year hearing the hype about KLU?s performance from several power system folks. I must say that I?m terribly disappointed! I did some performance testing of KLU on power grid problems (power flow application) last year and I got a similar performance that you report (PETSc is 2-4 times faster than KLU). I also clocked the time spent in PETSc?s SuiteSparse interface for KLU for operations other than factorization and it was very minimal. The fastest linear solver combination that I found was PETSc?s LU solver + AMD ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). Don?t try MUMPS and SuperLU ? they are terribly slow. Shri From: hong zhang Date: Monday, August 17, 2015 at 10:08 AM To: Romain Thomas Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Romain: >Do you mean small sparse sequential 200 by 200 matrices? >Petsc LU might be better than external LU packages because it implements >simple LU algorithm and we took good care on data accesing (I've heard >same observations). You may try 'qmd' matrix ordering for power grid >simulation. >I do not have experience on SuiteSparse. Testing MUMPS is worth it as >well. > >Hong > > >Hi >Thank you for your answer. I was asking help because I find LU >factorization 2-3 times faster than KLU. According to my problem size >(200*200) and type (power system simulation), I should get almost the >same computation time. Is it true to think that? Is the > difference of time due to the interface between PETSc and SuiteSparse? 
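To try the combination Shri recommends above without touching any code, both the factorization package and the ordering can be chosen from the options database at run time. The option names below are as spelled in the 3.5.x releases discussed in this thread, ./app stands in for the actual executable, and -log_summary supplies the timings for the comparison:

./app -ksp_type preonly -pc_type lu -pc_factor_mat_ordering_type amd -log_summary
./app -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package klu -log_summary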
>Thank you, >Romain > >-----Original Message----- >From: Barry Smith [mailto:bsmith at mcs.anl.gov] >Sent: vrijdag 14 augustus 2015 17:31 >To: Romain Thomas >Cc: Matthew Knepley; petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate >MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >above. > There is no reason to (and it won't work) to call > >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > >directly. > > Barry > >> On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: >> >> Hi, >> Thank you for your answer. >> My problem is a bit more complex. During the simulation (?real time?), >>I need to upgrade at each time step the matrix A and the >>MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid >>these functions, I don?t use ksp or pc. I prefer to use > the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >And so, I want to know if there is similar functions for KLU. (I tried >for Cholesky and, iLU and it works well). >> Best regards, >> Romain >> >> >> From: Matthew Knepley [mailto:knepley at gmail.com] >> Sent: vrijdag 14 augustus 2015 16:41 >> To: Romain Thomas >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] petsc KLU >> >> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>wrote: >> Dear PETSc users, >> >> I would like to know if I can replace the following functions >> >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> by >> >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >> in my code for the simulation of electrical power systems? (I >> installed the package SuiteSparse) >> >> Why would you do that? It already works with the former code. In fact, >> you should really just use >> >> KSPCreate(comm, &ksp) >> KSPSetOperator(ksp, A, A); >> KSPSetFromOptions(ksp); >> KSPSolve(ksp, b, x); >> >> and then give the options >> >> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >> >> This is no advantage to using the Factor language since subsequent >> calls to >> KSPSolve() will not refactor. >> >> Matt >> >> Thank you, >> Best regards, >> Romain >> >> >> >> -- >> What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >> -- Norbert Wiener > > > > > From xzhao99 at gmail.com Mon Aug 17 11:31:42 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:31:42 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: This is run with PETSc opt mode, so the error message looks not very useful, see below: Probably I should use dbg version to see the details. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [2]PETSC ERROR: [1]PETSC ERROR: to get more information on the crash. to get more information on the crash. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [3]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [1]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [1]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [2]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [2]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [2]PETSC ERROR: #1 User provided function() line 0 in unknown file [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [3]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [3]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [3]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [cli_1]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 [cli_2]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 [cli_3]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 On Mon, Aug 17, 2015 at 11:16 AM, Dmitry Karpeyev wrote: > Xujun, > Regarding your original question: please send the complete error message. > Dmitry. > > On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: > >> Ahhhh, I should drink some coffee in the morning. >> Now it passed the test! >> >> On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley >> wrote: >> >>> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >>> >>>> No. It gives the following error msg: >>>> >>> >>> Did you build the executable? 
>>> >>> cd src/vec/vec/examples/tutorials >>> make ex43 >>> >>> Matt >>> >>> >>>> mpirun -np 2 ex43 >>>> >>>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>>> directory) >>>> >>>> execvp error on file ex43 (No such file or directory) >>>> >>>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I want PETSc to generate random vector using VecSetRandom() following >>>>>> given examples, but failed and showed some "out of memory" error. The >>>>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>>>> anyone help me figure out the reason? Thanks a lot. >>>>>> >>>>> >>>>> Does src/vec/vec/examples/tests/ex43.c run for you? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> XZ >>>>>> >>>>>> >>>>>> >>>>>> -------------------------------------------------------------------------------------------- >>>>>> Vec u; >>>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>>> PetscMPIInt size, rank; >>>>>> PetscInt n, dn; >>>>>> >>>>>> >>>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>>> n = N/size + 1; >>>>>> dn = n*size - N; >>>>>> if ( dn>0 && rank>>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>>> %d\n",rank,n); >>>>>> >>>>>> >>>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>>> VecSetSizes(u,n,N); >>>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>>> #if defined(PETSC_HAVE_DRAND48) >>>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>>> #elif defined(PETSC_HAVE_RAND) >>>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>>> #endif >>>>>> PetscRandomSetFromOptions(rand_ctx); >>>>>> >>>>>> >>>>>> VecSetRandom(u,rand_ctx); >>>>>> PetscRandomDestroy(&rand_ctx); >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.yaghmaie2 at gmail.com Mon Aug 17 11:46:35 2015 From: reza.yaghmaie2 at gmail.com (Reza Yaghmaie) Date: Mon, 17 Aug 2015 12:46:35 -0400 Subject: [petsc-users] SNESSetFunction Message-ID: Hi, I have problems with passing variables through *SNESSetFunction* in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (*FormFunction1*)? 
Thanks, Reza ------------------------------------------------------------------------------------------------------------------ *main code* *SNES* snes *Vec* xvec,rvec *external* FormFunction1 *real*8* variable1(10),variable2(20,20),variable3(30),variable4(40,40) call *SNESSetFunction*(snes,rvec,FormFunction1, & PETSC_NULL_OBJECT, & variable1,variable2,variable3,variable4, & ierr) end subroutine *FormFunction1*(snes,XVEC,FVEC, & dummy, & varable1,varable2,varable3,varable4, & ierr) *SNES* snes *Vec* XVEC,FVEC *PetscFortranAddr* dummy *real*8* variable1(10),variable2(20,20),variable3(30),variable4(40,40) return end -------------------------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkarpeev at gmail.com Mon Aug 17 11:49:56 2015 From: dkarpeev at gmail.com (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:49:56 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Use a dbg build with a debugger and/or valgrind. Dmitry. On Mon, Aug 17, 2015 at 11:31 AM Xujun Zhao wrote: > This is run with PETSc opt mode, so the error message looks not very > useful, see below: > Probably I should use dbg version to see the details. > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [1]PETSC ERROR: > ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [2]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [2]PETSC ERROR: [1]PETSC ERROR: to get more information on the crash. > > to get more information on the crash. 
> > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [3]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [1]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [1]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [2]PETSC ERROR: Signal received > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [2]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [2]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [3]PETSC ERROR: Signal received > > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [3]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [3]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 > > [cli_0]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [cli_1]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > > [cli_2]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 > > [cli_3]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 > > On Mon, Aug 17, 2015 at 11:16 AM, Dmitry Karpeyev > wrote: > >> Xujun, >> Regarding your original question: please send the complete error message. >> Dmitry. >> >> On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: >> >>> Ahhhh, I should drink some coffee in the morning. >>> Now it passed the test! >>> >>> On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley >>> wrote: >>> >>>> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >>>> >>>>> No. It gives the following error msg: >>>>> >>>> >>>> Did you build the executable? >>>> >>>> cd src/vec/vec/examples/tutorials >>>> make ex43 >>>> >>>> Matt >>>> >>>> >>>>> mpirun -np 2 ex43 >>>>> >>>>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>>>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>>>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>>>> directory) >>>>> >>>>> execvp error on file ex43 (No such file or directory) >>>>> >>>>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao >>>>>> wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I want PETSc to generate random vector using VecSetRandom() >>>>>>> following given examples, but failed and showed some "out of memory" error. >>>>>>> The following is the code, which goes well until it reaches VecSetRandom(). 
>>>>>>> Can anyone help me figure out the reason? Thanks a lot. >>>>>>> >>>>>> >>>>>> Does src/vec/vec/examples/tests/ex43.c run for you? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> XZ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -------------------------------------------------------------------------------------------- >>>>>>> Vec u; >>>>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>>>> PetscMPIInt size, rank; >>>>>>> PetscInt n, dn; >>>>>>> >>>>>>> >>>>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>>>> n = N/size + 1; >>>>>>> dn = n*size - N; >>>>>>> if ( dn>0 && rank>>>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>>>> %d\n",rank,n); >>>>>>> >>>>>>> >>>>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>>>> VecSetSizes(u,n,N); >>>>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>>>> #if defined(PETSC_HAVE_DRAND48) >>>>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>>>> #elif defined(PETSC_HAVE_RAND) >>>>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>>>> #endif >>>>>>> PetscRandomSetFromOptions(rand_ctx); >>>>>>> >>>>>>> >>>>>>> VecSetRandom(u,rand_ctx); >>>>>>> PetscRandomDestroy(&rand_ctx); >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 12:39:31 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 12:39:31 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > Hi, > > I have problems with passing variables through *SNESSetFunction* in my > code. basically I have the following subroutines in the main body of the > Fortran code. Could you provide some insight on how to transfer variables > into the residual calculation routine (*FormFunction1*)? > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other PetscObjects using PetscObjectCompose() in Fortran. 
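A bare-bones Fortran sketch of the context idea discussed here: the extra arrays are gathered into a user-defined type, that variable is passed as the context argument of SNESSetFunction(), and the same variable arrives untouched in the residual routine. Type and field names are illustrative, the PETSc Fortran include files are left out for brevity, and the ex5f90 tutorial in the PETSc source tree contains a complete working version of this pattern.

      module userdata
        type appctx
          real*8 :: variable1(10)
          real*8 :: variable2(20,20)
          real*8 :: variable3(30)
          real*8 :: variable4(40,40)
        end type appctx
      end module userdata

!     in the main program:
!       use userdata
!       type(appctx) :: user
!       ... fill user%variable1, ..., user%variable4, then:
!       call SNESSetFunction(snes,rvec,FormFunction1,user,ierr)

      subroutine FormFunction1(snes,xvec,fvec,user,ierr)
        use userdata
        SNES snes
        Vec xvec,fvec
        type(appctx) user
        PetscErrorCode ierr
!       ... read user%variable1 etc. and assemble fvec from xvec ...
        ierr = 0
      end subroutine FormFunction1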
Matt > Thanks, > Reza > > ------------------------------------------------------------------------------------------------------------------ > *main code* > > *SNES* snes > *Vec* xvec,rvec > *external* FormFunction1 > *real*8* > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > call *SNESSetFunction*(snes,rvec,FormFunction1, > & PETSC_NULL_OBJECT, > & variable1,variable2,variable3,variable4, > & ierr) > > end > > subroutine *FormFunction1*(snes,XVEC,FVEC, > & dummy, > & varable1,varable2,varable3,varable4, > & ierr) > > *SNES* snes > *Vec* XVEC,FVEC > *PetscFortranAddr* dummy > *real*8* > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > return > end > > -------------------------------------------------------------------------------------------------------------- > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 17 13:21:51 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 17 Aug 2015 12:21:51 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: <55CDA6E2.5000302@ntnu.no> References: <55CDA6E2.5000302@ntnu.no> Message-ID: Thanks everyone for your valuable input, a few follow up questions: 1) The specs for my machine says there are 10 cores and 20 threads. Does that mean for each socket, i have 10 cores where each core has 2 threads? Or does it mean that each core can use up to 20 threads? Or something else entirely? 2a) When I do an hwloc-info on a single compute node: $ hwloc-info depth 0: 1 Machine (type #1) depth 1: 2 NUMANode (type #2) depth 2: 2 Socket (type #3) depth 3: 2 L3Cache (type #4) depth 4: 20 L2Cache (type #4) depth 5: 20 L1dCache (type #4) depth 6: 20 L1iCache (type #4) depth 7: 20 Core (type #5) depth 8: 20 PU (type #6) Special depth -3: 5 Bridge (type #9) Special depth -4: 6 PCI Device (type #10) Special depth -5: 6 OS Device (type #11) With this setup, does it mean that if I invoke mpiexec.hydra -np -bind-to hwthread ... the MPI program will bind to the cores? 2b) Our headnode has 40 PU at depth 8, so if I -bind-to hwthread on this node (and get yelled at by the system admins) it's possible that two MPI processes can run on the same core? 3) When I invoke an MPI process via mpiexec.hydra -np ... without any bindings, do we know what exactly is going on? Thanks, Justin On Fri, Aug 14, 2015 at 2:29 AM, ?smund Ervik wrote: >>> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >>> your memory or motherboard is at most 1600 MHz, so your peak would be >>> 102.4 GB/s. >> >>> You can check this as root using "dmidecode --type 17", which should >>> give one entry per channel, looking something like this: >>> >>> Handle 0x002B, DMI type 17, 34 bytes >>> Memory Device >>> Array Handle: 0x002A >>> Error Information Handle: 0x002F >>> Total Width: Unknown >>> Data Width: Unknown >>> Size: 4096 MB >>> Form Factor: DIMM >>> Set: None >>> Locator: DIMM0 >>> Bank Locator: BANK 0 >>> Type: >>> Type Detail: None >>> Speed: Unknown >>> Manufacturer: Not Specified >>> Serial Number: Not Specified >>> Asset Tag: Unknown >>> Part Number: Not Specified >>> Rank: Unknown >>> Configured Clock Speed: 1600 MHz >> >>I have no root access. Is there another way to confirm the clock speed? 
> > Also note: even in the case where your motherboard, RAM and CPU all say > 1866 on the label, if there are more memory DIMMs (chips) per node than > channels, say 16 DIMMs on your 8 channels, you will see a performance > reduction on the order of 20-30%. This is more likely if you are using > nodes in a "high-memory queue" or similar where there's >= 128 GB memory > per node. (This will change in the future when/if people start using > DDR4 LRDIMMs.) There's a series of in-depth discussions here: > http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also > lots of interesting memory-stuff on John McCalpin's blog: > https://sites.utexas.edu/jdm4372/ > > Regards, > ?smund > From bsmith at mcs.anl.gov Mon Aug 17 13:25:46 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 13:25:46 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: Reza, See src/snes/examples/tutorials/ex5f90.F for how this may be easily done using a Fortran user defined type Barry > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > Hi, > > I have problems with passing variables through SNESSetFunction in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (FormFunction1)? > > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. > > This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other > PetscObjects using PetscObjectCompose() in Fortran. > > Matt > > Thanks, > Reza > ------------------------------------------------------------------------------------------------------------------ > main code > > SNES snes > Vec xvec,rvec > external FormFunction1 > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > call SNESSetFunction(snes,rvec,FormFunction1, > & PETSC_NULL_OBJECT, > & variable1,variable2,variable3,variable4, > & ierr) > > end > > subroutine FormFunction1(snes,XVEC,FVEC, > & dummy, > & varable1,varable2,varable3,varable4, > & ierr) > > SNES snes > Vec XVEC,FVEC > PetscFortranAddr dummy > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > return > end > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Mon Aug 17 13:35:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 13:35:31 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <55CDA6E2.5000302@ntnu.no> Message-ID: <7FED100B-A227-4C8D-AB1C-DA3FCF9C8F38@mcs.anl.gov> > On Aug 17, 2015, at 1:21 PM, Justin Chang wrote: > > Thanks everyone for your valuable input, a few follow up questions: > > 1) The specs for my machine says there are 10 cores and 20 threads. > Does that mean for each socket, i have 10 cores where each core has 2 > threads? Or does it mean that each core can use up to 20 threads? Or > something else entirely? Some times a single core has support for multiple (like 2) "hardware threads". 
What this means is that the core has "extra" hardware, generally registers, that allow switching between 2 threads on the core without having to save all the registers for one thread and load all the registers from the other thread (essentially it has more registers then it would have without support for "hardware threads". This means that if the core has two threads, they can be switched back and forth very rapidly. The reason hardware designers put this in is so if one thread is waiting for memory loads, it can switch to the other thread and get some work done during the time. This allows memory latency hiding. The term "hardware threads" is not a very accurate term IMHO. You can use any many or as few threads on this system as you want to, the system is just optimized hardware wise to run with 20 threads for latency hiding. Note that PETSc codes are generally memory bandwidth limited, not memory latency limited and so generally with PETSc code it makes sense to use fewer threads than cores (you are not utilizing all the "extra" hardware then but it is faster). Barry > > 2a) When I do an hwloc-info on a single compute node: > > $ hwloc-info > > depth 0: 1 Machine (type #1) > > depth 1: 2 NUMANode (type #2) > > depth 2: 2 Socket (type #3) > > depth 3: 2 L3Cache (type #4) > > depth 4: 20 L2Cache (type #4) > > depth 5: 20 L1dCache (type #4) > > depth 6: 20 L1iCache (type #4) > > depth 7: 20 Core (type #5) > > depth 8: 20 PU (type #6) > > Special depth -3: 5 Bridge (type #9) > > Special depth -4: 6 PCI Device (type #10) > > Special depth -5: 6 OS Device (type #11) > > With this setup, does it mean that if I invoke mpiexec.hydra -np > -bind-to hwthread ... the MPI program will bind to the cores? > > 2b) Our headnode has 40 PU at depth 8, so if I -bind-to hwthread on > this node (and get yelled at by the system admins) it's possible that > two MPI processes can run on the same core? > > 3) When I invoke an MPI process via mpiexec.hydra -np ... > without any bindings, do we know what exactly is going on? > > Thanks, > Justin > > On Fri, Aug 14, 2015 at 2:29 AM, ?smund Ervik wrote: >>>> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >>>> your memory or motherboard is at most 1600 MHz, so your peak would be >>>> 102.4 GB/s. >>> >>>> You can check this as root using "dmidecode --type 17", which should >>>> give one entry per channel, looking something like this: >>>> >>>> Handle 0x002B, DMI type 17, 34 bytes >>>> Memory Device >>>> Array Handle: 0x002A >>>> Error Information Handle: 0x002F >>>> Total Width: Unknown >>>> Data Width: Unknown >>>> Size: 4096 MB >>>> Form Factor: DIMM >>>> Set: None >>>> Locator: DIMM0 >>>> Bank Locator: BANK 0 >>>> Type: >>>> Type Detail: None >>>> Speed: Unknown >>>> Manufacturer: Not Specified >>>> Serial Number: Not Specified >>>> Asset Tag: Unknown >>>> Part Number: Not Specified >>>> Rank: Unknown >>>> Configured Clock Speed: 1600 MHz >>> >>> I have no root access. Is there another way to confirm the clock speed? >> >> Also note: even in the case where your motherboard, RAM and CPU all say >> 1866 on the label, if there are more memory DIMMs (chips) per node than >> channels, say 16 DIMMs on your 8 channels, you will see a performance >> reduction on the order of 20-30%. This is more likely if you are using >> nodes in a "high-memory queue" or similar where there's >= 128 GB memory >> per node. (This will change in the future when/if people start using >> DDR4 LRDIMMs.) 
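To make the numbers being discussed concrete (for example the 55%-of-peak fraction quoted above), the usual measurement is a STREAM-style triad; PETSc ships its own version of this benchmark (the "streams" make target). The stand-alone sketch below only illustrates the kernel being timed, it is not the benchmark itself, and a single-threaded run will understate the aggregate bandwidth of a fully populated node.

--------------------------------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (10*1000*1000)   /* large enough that the arrays do not fit in cache */

int main(void)
{
  double  *a = malloc(N*sizeof(double));
  double  *b = malloc(N*sizeof(double));
  double  *c = malloc(N*sizeof(double));
  double  scalar = 3.0, sec;
  long    i;
  clock_t t0;

  if (!a || !b || !c) return 1;
  for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

  t0 = clock();
  for (i = 0; i < N; i++) a[i] = b[i] + scalar*c[i];   /* triad: two loads, one store */
  sec = (double)(clock() - t0)/CLOCKS_PER_SEC;

  /* three arrays of 8-byte doubles each move through memory once */
  printf("a[42]=%g  approx single-thread bandwidth: %.1f GB/s\n",
         a[42], 3.0*N*sizeof(double)/sec/1e9);
  free(a); free(b); free(c);
  return 0;
}

--------------------------------------------------------------------------------------------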
There's a series of in-depth discussions here: >> http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also >> lots of interesting memory-stuff on John McCalpin's blog: >> https://sites.utexas.edu/jdm4372/ >> >> Regards, >> ?smund >> From gideon.simpson at gmail.com Mon Aug 17 15:05:27 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Mon, 17 Aug 2015 16:05:27 -0400 Subject: [petsc-users] superlu_dist output Message-ID: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Is there a way to suppress this kind of output when running with superlu_dist? .. equilibrated? *equed = B .. LDPERM job 5 time: 0.00 .. anorm 3.541231e+01 .. Use METIS ordering on A'+A .. symbfact(): relax 10, maxsuper 60, fill 4 No of supers 73 Size of G(L) 991 Size of G(U) 722 int 4, short 2, float 4, double 8 SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 .. # L blocks 228 # U blocks 211 MPI tag upper bound = 536870911 === using DAG === * init: 1.779720e-04 seconds .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 NUMfact (MB) all PEs: L\U 0.10 all 0.12 All space (MB): total 0.32 Avg 0.32 Max 0.32 Number of tiny pivots: 0 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. anorm 3.378641e+01 -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 15:14:10 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 15:14:10 -0500 Subject: [petsc-users] superlu_dist output In-Reply-To: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Message-ID: How can we reproduce this? Is this using debian pkg install of petsc, superlu_dist? [which versions?] >>>>> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist Norm of error 1.87427e-15 iterations 1 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ <<<<<< You can try installing our latest version using --download-superlu_dist and see if you still have this issue. Satish Satish On Mon, 17 Aug 2015, Gideon Simpson wrote: > Is there a way to suppress this kind of output when running with superlu_dist? > > .. equilibrated? *equed = B > .. LDPERM job 5 time: 0.00 > .. anorm 3.541231e+01 > .. Use METIS ordering on A'+A > .. symbfact(): relax 10, maxsuper 60, fill 4 > No of supers 73 > Size of G(L) 991 > Size of G(U) 722 > int 4, short 2, float 4, double 8 > SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 > .. # L blocks 228 # U blocks 211 > MPI tag upper bound = 536870911 > === using DAG === > * init: 1.779720e-04 seconds > .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 > .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 > NUMfact (MB) all PEs: L\U 0.10 all 0.12 > All space (MB): total 0.32 Avg 0.32 Max 0.32 > Number of tiny pivots: 0 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. 
anorm 3.378641e+01 > > -gideon > > From gideon.simpson at gmail.com Mon Aug 17 15:18:49 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Mon, 17 Aug 2015 16:18:49 -0400 Subject: [petsc-users] superlu_dist output In-Reply-To: References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Message-ID: <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> This is the macports version. For me, this appears when running a problem with a linear solver with the flags -pc_factor_mat_solver_package superlu_dist -pc_type lu Nothing else is required. The answer it gives is correct, I?d just like to suppress the output. -gideon > On Aug 17, 2015, at 4:14 PM, Satish Balay wrote: > > How can we reproduce this? Is this using debian pkg install of petsc, > superlu_dist? [which versions?] > >>>>>> > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > Norm of error 1.87427e-15 iterations 1 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ > <<<<<< > > You can try installing our latest version using > --download-superlu_dist and see if you still have this issue. > > Satish > > Satish > > On Mon, 17 Aug 2015, Gideon Simpson wrote: > >> Is there a way to suppress this kind of output when running with superlu_dist? >> >> .. equilibrated? *equed = B >> .. LDPERM job 5 time: 0.00 >> .. anorm 3.541231e+01 >> .. Use METIS ordering on A'+A >> .. symbfact(): relax 10, maxsuper 60, fill 4 >> No of supers 73 >> Size of G(L) 991 >> Size of G(U) 722 >> int 4, short 2, float 4, double 8 >> SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 >> .. # L blocks 228 # U blocks 211 >> MPI tag upper bound = 536870911 >> === using DAG === >> * init: 1.779720e-04 seconds >> .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 >> .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 >> NUMfact (MB) all PEs: L\U 0.10 all 0.12 >> All space (MB): total 0.32 Avg 0.32 Max 0.32 >> Number of tiny pivots: 0 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. anorm 3.378641e+01 >> >> -gideon >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 15:37:49 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 15:37:49 -0500 Subject: [petsc-users] superlu_dist output In-Reply-To: <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> Message-ID: from superlu_dist README >>>>>>>> o -DPRNTlevel=[0,1,2,...] printing level to show solver's execution details. (default is 0) o -DDEBUGlevel=[0,1,2,...] diagnostic printing level for debugging purpose. (default is 0) <<<<< Presumably macports version of superlu_dist is built with these flags enabled. cc:ing Sean to check if its an issue with macports superlu_dist package. If you need to get rid of this output - you can build petsc from sources [with --download_superlu_dist] Satish On Mon, 17 Aug 2015, Gideon Simpson wrote: > This is the macports version. For me, this appears when running a problem with a linear solver with the flags > > -pc_factor_mat_solver_package superlu_dist -pc_type lu > > Nothing else is required. The answer it gives is correct, I?d just like to suppress the output. 
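For reference, the two flags above can also be set in code; a rough equivalent for an existing KSP (sketch only, using the PETSc 3.5-era call names) is shown below. The chatter itself is printed by SuperLU_DIST rather than by this PETSc-side setup, which is why the resolution further down concerns how the SuperLU_DIST library was compiled.

--------------------------------------------------------------------------------------------

#include <petscksp.h>

/* Mirror -pc_type lu -pc_factor_mat_solver_package superlu_dist for a KSP
   that has already been created and given its operator. */
static PetscErrorCode UseSuperLUDistLU(KSP ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------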
> > -gideon > > > On Aug 17, 2015, at 4:14 PM, Satish Balay wrote: > > > > How can we reproduce this? Is this using debian pkg install of petsc, > > superlu_dist? [which versions?] > > > >>>>>> > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > > Norm of error 1.87427e-15 iterations 1 > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ > > <<<<<< > > > > You can try installing our latest version using > > --download-superlu_dist and see if you still have this issue. > > > > Satish > > > > Satish > > > > On Mon, 17 Aug 2015, Gideon Simpson wrote: > > > >> Is there a way to suppress this kind of output when running with superlu_dist? > >> > >> .. equilibrated? *equed = B > >> .. LDPERM job 5 time: 0.00 > >> .. anorm 3.541231e+01 > >> .. Use METIS ordering on A'+A > >> .. symbfact(): relax 10, maxsuper 60, fill 4 > >> No of supers 73 > >> Size of G(L) 991 > >> Size of G(U) 722 > >> int 4, short 2, float 4, double 8 > >> SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 > >> .. # L blocks 228 # U blocks 211 > >> MPI tag upper bound = 536870911 > >> === using DAG === > >> * init: 1.779720e-04 seconds > >> .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 > >> .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 > >> NUMfact (MB) all PEs: L\U 0.10 all 0.12 > >> All space (MB): total 0.32 Avg 0.32 Max 0.32 > >> Number of tiny pivots: 0 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. anorm 3.378641e+01 > >> > >> -gideon > >> > >> > > > > From sean at farley.io Mon Aug 17 16:13:08 2015 From: sean at farley.io (Sean Farley) Date: Mon, 17 Aug 2015 14:13:08 -0700 Subject: [petsc-users] superlu_dist output In-Reply-To: References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> Message-ID: Satish Balay writes: > from superlu_dist README > >>>>>>>>> > o -DPRNTlevel=[0,1,2,...] > printing level to show solver's execution details. (default is 0) > > o -DDEBUGlevel=[0,1,2,...] > diagnostic printing level for debugging purpose. (default is 0) > <<<<< > > Presumably macports version of superlu_dist is built with these flags enabled. > > cc:ing Sean to check if its an issue with macports superlu_dist package. I don't think they're set at all: http://trac.macports.org/browser/trunk/dports/math/superlu_dist/Portfile I'm in the process of upgrading superlu_dist soon, so maybe that will help. From fdkong.jd at gmail.com Mon Aug 17 17:32:35 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 17 Aug 2015 16:32:35 -0600 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? Message-ID: Hi all, I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special reasons to distinguish them? Thanks, Fande Kong, -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 17:49:32 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 17:49:32 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: I think some MPI impls didn't provide some of the ops on MPI_COMPLEX datatype. 
So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN Satish On Mon, 17 Aug 2015, Fande Kong wrote: > Hi all, > > I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special > reasons to distinguish them? > > Thanks, > > Fande Kong, > From bsmith at mcs.anl.gov Mon Aug 17 19:18:52 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 19:18:52 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: It is crucial. MPI also doesn't provide sums for __float128 precision. But MPI does always provide sums for 32 and 64 bit integers so no need for MPIU_SUM for PETSC_INT > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > datatype. > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN > > Satish > > On Mon, 17 Aug 2015, Fande Kong wrote: > >> Hi all, >> >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special >> reasons to distinguish them? >> >> Thanks, >> >> Fande Kong, >> > From fdkong.jd at gmail.com Mon Aug 17 21:53:01 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 17 Aug 2015 20:53:01 -0600 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: Thanks, Barry, Satish, But, is it possible to uniform the use of MPI_SUM and MPIU_SUM? For example, we could let a Petsc function just switch to a regular MPI_Reduce or other function when using PetscInt. In other words, we need a wrapper. I always use MPIU_INT in a MPI function when using PetscInt. It is very straightforward to use MPIU_SUM, MPIU_MAX so on, when thinking about we are using MPIU_INT. Thanks, Fande Kong, On Mon, Aug 17, 2015 at 6:18 PM, Barry Smith wrote: > > It is crucial. MPI also doesn't provide sums for __float128 precision. > But MPI does always provide sums for 32 and 64 bit integers so no need for > MPIU_SUM for PETSC_INT > > > > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > > datatype. > > > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, > MPIU_MIN > > > > Satish > > > > On Mon, 17 Aug 2015, Fande Kong wrote: > > > >> Hi all, > >> > >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any > special > >> reasons to distinguish them? > >> > >> Thanks, > >> > >> Fande Kong, > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 17 22:01:17 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 22:01:17 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: > On Aug 17, 2015, at 9:53 PM, Fande Kong wrote: > > Thanks, Barry, Satish, > > But, is it possible to uniform the use of MPI_SUM and MPIU_SUM? For example, we could let a Petsc function just switch to a regular MPI_Reduce or other function when using PetscInt. In other words, we need a wrapper. I always use MPIU_INT in a MPI function when using PetscInt. It is very straightforward to use MPIU_SUM, MPIU_MAX so on, when thinking about we are using MPIU_INT. 
We could add code to the routine that gets called when one uses MPIU_SUM which is PetscSum_Local() and defined in pinit.c to handle all possible data types then you could always use MPIU_SUM. The reason we don't is that using a user provide reduction such as PetscSum_Local() will ALWAYS be less efficient then using the MPI built in reduction operations. For integers which MPI can always handle we prefer to us the fastest possible which is the built in operation for summing. Now likely the time difference between the user provided one vs the built in one is too small to measure, I agree, but for me it is easy enough just to remember that MPIU_SUM is only needed for floating pointer numbers not integers. Barry > > Thanks, > > Fande Kong, > > On Mon, Aug 17, 2015 at 6:18 PM, Barry Smith wrote: > > It is crucial. MPI also doesn't provide sums for __float128 precision. But MPI does always provide sums for 32 and 64 bit integers so no need for MPIU_SUM for PETSC_INT > > > > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > > datatype. > > > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN > > > > Satish > > > > On Mon, 17 Aug 2015, Fande Kong wrote: > > > >> Hi all, > >> > >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special > >> reasons to distinguish them? > >> > >> Thanks, > >> > >> Fande Kong, > >> > > > > From timothee.nicolas at gmail.com Tue Aug 18 03:42:36 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 18 Aug 2015 17:42:36 +0900 Subject: [petsc-users] Wise usage of user contexts Message-ID: Hi all, I am in the process of writing an implicit solver for a set of PDEs (namely MHD equations), in FORTRAN. When setting the non-linear function to solve via Newton-Krylov, I use a "user defined context", namely the thing denoted by "ctx" on the doc page about SNESFunction : http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction In practice ctx is a user defined type which contains everything I need in the local routine which sets the function on the local part of the grid, FormFunctionLocal. That is, some local/global geometrical information on the grid, the physical parameter, and possibly any other thing. In my case it so happens that due to the scheme I have chosen, when I compute my function, I need the full solution of the problem at the last two time steps (which are in Vec format). So my ctx contains two Vec elements. Since I will work in 3D and intend to use a lot of points in the future, I am concerned about memory problems which could occur. Is there a limit to the size occupied by ctx ? Would this be better if instead I was declaring global variables in a module and using this module inside FormFunctionLocal ? Is this allowed ? Best regards Timothee NICOLAS -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Tue Aug 18 03:59:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 18 Aug 2015 10:59:58 +0200 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: On 18 August 2015 at 10:42, Timoth?e Nicolas wrote: > Hi all, > > I am in the process of writing an implicit solver for a set of PDEs > (namely MHD equations), in FORTRAN. 
When setting the non-linear function to > solve via Newton-Krylov, I use a "user defined context", namely the thing > denoted by "ctx" on the doc page about SNESFunction : > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction > > In practice ctx is a user defined type which contains everything I need in > the local routine which sets the function on the local part of the grid, > FormFunctionLocal. That is, some local/global geometrical information on > the grid, the physical parameter, and possibly any other thing. > > In my case it so happens that due to the scheme I have chosen, when I > compute my function, I need the full solution of the problem at the last > two time steps (which are in Vec format). So my ctx contains two Vec > elements. Since I will work in 3D and intend to use a lot of points in the > future, I am concerned about memory problems which could occur. > In the grand scheme of things, the two vectors in your context aren't likely to significantly add to the total memory footprint of your code. A couple of things to note: * If you run in parallel, only the local part of the vector will be stored on each MPI process. * All the KSP methods will allocate auxiliary vectors. Most methods require more than 2 auxiliary vectors. * SNES also requires auxiliary vectors. If you use JFNK, that method will also need some additional temporary vectors. * If you assemble a Jacobian, this matrix will likely require much more memory per MPI process than two vectors > Is there a limit to the size occupied by ctx ? > The only limit is defined by the available memory per MPI process you have on your target machine. > Would this be better if instead I was declaring global variables in a > module and using this module inside FormFunctionLocal ? Is this allowed ? > What would be the difference in doing that - the memory usage will be identical. Cheers Dave > > Best regards > > Timothee NICOLAS > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 18 04:20:35 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 18 Aug 2015 18:20:35 +0900 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: Dave, Thx a lot for your very clear answer. My last question about modules could be reformulated like this : Why would I put anything in a ctx while I could simply use modules ? Maybe it has something to do with the fact that PETSc is initially written for C ? Best Timothee 2015-08-18 17:59 GMT+09:00 Dave May : > > > On 18 August 2015 at 10:42, Timoth?e Nicolas > wrote: > >> Hi all, >> >> I am in the process of writing an implicit solver for a set of PDEs >> (namely MHD equations), in FORTRAN. When setting the non-linear function to >> solve via Newton-Krylov, I use a "user defined context", namely the thing >> denoted by "ctx" on the doc page about SNESFunction : >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction >> >> In practice ctx is a user defined type which contains everything I need >> in the local routine which sets the function on the local part of the grid, >> FormFunctionLocal. That is, some local/global geometrical information on >> the grid, the physical parameter, and possibly any other thing. 
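As a concrete illustration of such a context (written in C here rather than Fortran, with every name invented for the sketch), the layout might look like the following; the two Vec members anticipate the time-history storage described in the next paragraph.

--------------------------------------------------------------------------------------------

#include <petscdmda.h>

/* Invented example of a user context for a DMDA-based FormFunctionLocal(). */
typedef struct {
  DM        da;        /* grid/geometry information */
  PetscReal eta, nu;   /* physical parameters (placeholders) */
  Vec       sol_nm1;   /* solution at time step n-1 */
  Vec       sol_nm2;   /* solution at time step n-2 */
} MHDCtx;

/* After each accepted step the history is rotated, e.g.
     ierr = VecCopy(ctx->sol_nm1,ctx->sol_nm2);CHKERRQ(ierr);
     ierr = VecCopy(x,ctx->sol_nm1);CHKERRQ(ierr);
   Keeping the two Vecs in the context costs two extra (distributed) solution
   vectors, which, as noted in the reply, is small next to the Krylov work
   vectors and an assembled Jacobian. */

--------------------------------------------------------------------------------------------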
>> >> In my case it so happens that due to the scheme I have chosen, when I >> compute my function, I need the full solution of the problem at the last >> two time steps (which are in Vec format). So my ctx contains two Vec >> elements. Since I will work in 3D and intend to use a lot of points in the >> future, I am concerned about memory problems which could occur. >> > > In the grand scheme of things, the two vectors in your context aren't > likely to significantly add to the total memory footprint of your code. A > couple of things to note: > * If you run in parallel, only the local part of the vector will be stored > on each MPI process. > * All the KSP methods will allocate auxiliary vectors. Most methods > require more than 2 auxiliary vectors. > * SNES also requires auxiliary vectors. If you use JFNK, that method will > also need some additional temporary vectors. > * If you assemble a Jacobian, this matrix will likely require much more > memory per MPI process than two vectors > > > >> Is there a limit to the size occupied by ctx ? >> > > The only limit is defined by the available memory per MPI process you have > on your target machine. > > >> Would this be better if instead I was declaring global variables in a >> module and using this module inside FormFunctionLocal ? Is this allowed ? >> > > What would be the difference in doing that - the memory usage will be > identical. > > Cheers > Dave > > >> >> Best regards >> >> Timothee NICOLAS >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From D.J.P.Lahaye at tudelft.nl Tue Aug 18 06:34:14 2015 From: D.J.P.Lahaye at tudelft.nl (Domenico Lahaye - EWI) Date: Tue, 18 Aug 2015 11:34:14 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> , <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> Message-ID: <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> Dear all, Have the disappointing results of KLU been reported somewhere? Earlier claims made might reinforce claims that we want to make. Sincere thanks, Domenico. ________________________________________ From: Romain Thomas Sent: Tuesday, August 18, 2015 1:10 PM To: Domenico Lahaye - EWI Subject: FW: [petsc-users] petsc KLU Hi, You can find below the message from Shri. Best regards, Romain -----Original Message----- From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] Sent: maandag 17 augustus 2015 18:21 To: Romain Thomas; Zhang, Hong Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU Romain, I added the KLU interface to PETSc last year hearing the hype about KLU?s performance from several power system folks. I must say that I?m terribly disappointed! I did some performance testing of KLU on power grid problems (power flow application) last year and I got a similar performance that you report (PETSc is 2-4 times faster than KLU). I also clocked the time spent in PETSc?s SuiteSparse interface for KLU for operations other than factorization and it was very minimal. The fastest linear solver combination that I found was PETSc?s LU solver + AMD ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). Don?t try MUMPS and SuperLU ? they are terribly slow. 
Shri From: hong zhang Date: Monday, August 17, 2015 at 10:08 AM To: Romain Thomas Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Romain: >Do you mean small sparse sequential 200 by 200 matrices? >Petsc LU might be better than external LU packages because it >implements simple LU algorithm and we took good care on data accesing >(I've heard same observations). You may try 'qmd' matrix ordering for >power grid simulation. >I do not have experience on SuiteSparse. Testing MUMPS is worth it as >well. > >Hong > > >Hi >Thank you for your answer. I was asking help because I find LU >factorization 2-3 times faster than KLU. According to my problem size >(200*200) and type (power system simulation), I should get almost the >same computation time. Is it true to think that? Is the difference of >time due to the interface between PETSc and SuiteSparse? >Thank you, >Romain > >-----Original Message----- >From: Barry Smith [mailto:bsmith at mcs.anl.gov] >Sent: vrijdag 14 augustus 2015 17:31 >To: Romain Thomas >Cc: Matthew Knepley; petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >> MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate >MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >above. > There is no reason to (and it won't work) to call > >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > >directly. > > Barry > >> On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: >> >> Hi, >> Thank you for your answer. >> My problem is a bit more complex. During the simulation (?real >>time?), I need to upgrade at each time step the matrix A and the >>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>avoid these functions, I don?t use ksp or pc. I prefer to use > the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >And so, I want to know if there is similar functions for KLU. (I tried >for Cholesky and, iLU and it works well). >> Best regards, >> Romain >> >> >> From: Matthew Knepley [mailto:knepley at gmail.com] >> Sent: vrijdag 14 augustus 2015 16:41 >> To: Romain Thomas >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] petsc KLU >> >> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>wrote: >> Dear PETSc users, >> >> I would like to know if I can replace the following functions >> >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >> MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> by >> >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >> in my code for the simulation of electrical power systems? (I >> installed the package SuiteSparse) >> >> Why would you do that? It already works with the former code. 
In >> fact, you should really just use >> >> KSPCreate(comm, &ksp) >> KSPSetOperator(ksp, A, A); >> KSPSetFromOptions(ksp); >> KSPSolve(ksp, b, x); >> >> and then give the options >> >> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >> >> This is no advantage to using the Factor language since subsequent >> calls to >> KSPSolve() will not refactor. >> >> Matt >> >> Thank you, >> Best regards, >> Romain >> >> >> >> -- >> What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >> -- Norbert Wiener > > > > > From knepley at gmail.com Tue Aug 18 06:47:40 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Aug 2015 06:47:40 -0500 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: On Tue, Aug 18, 2015 at 4:20 AM, Timoth?e Nicolas < timothee.nicolas at gmail.com> wrote: > Dave, > > Thx a lot for your very clear answer. My last question about modules could > be reformulated like this : > > Why would I put anything in a ctx while I could simply use modules ? Maybe > it has something to do with the fact that PETSc is initially written for C ? > We think it makes the code more modular and easier to understand, especially if many pieces are composed together. Global variables have to be tracked down by someone looking at your code. Thanks, Matt > Best > > Timothee > > 2015-08-18 17:59 GMT+09:00 Dave May : > >> >> >> On 18 August 2015 at 10:42, Timoth?e Nicolas >> wrote: >> >>> Hi all, >>> >>> I am in the process of writing an implicit solver for a set of PDEs >>> (namely MHD equations), in FORTRAN. When setting the non-linear function to >>> solve via Newton-Krylov, I use a "user defined context", namely the thing >>> denoted by "ctx" on the doc page about SNESFunction : >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction >>> >>> In practice ctx is a user defined type which contains everything I need >>> in the local routine which sets the function on the local part of the grid, >>> FormFunctionLocal. That is, some local/global geometrical information on >>> the grid, the physical parameter, and possibly any other thing. >>> >>> In my case it so happens that due to the scheme I have chosen, when I >>> compute my function, I need the full solution of the problem at the last >>> two time steps (which are in Vec format). So my ctx contains two Vec >>> elements. Since I will work in 3D and intend to use a lot of points in the >>> future, I am concerned about memory problems which could occur. >>> >> >> In the grand scheme of things, the two vectors in your context aren't >> likely to significantly add to the total memory footprint of your code. A >> couple of things to note: >> * If you run in parallel, only the local part of the vector will be >> stored on each MPI process. >> * All the KSP methods will allocate auxiliary vectors. Most methods >> require more than 2 auxiliary vectors. >> * SNES also requires auxiliary vectors. If you use JFNK, that method will >> also need some additional temporary vectors. >> * If you assemble a Jacobian, this matrix will likely require much more >> memory per MPI process than two vectors >> >> >> >>> Is there a limit to the size occupied by ctx ? >>> >> >> The only limit is defined by the available memory per MPI process you >> have on your target machine. 
>> >> >>> Would this be better if instead I was declaring global variables in a >>> module and using this module inside FormFunctionLocal ? Is this allowed ? >>> >> >> What would be the difference in doing that - the memory usage will be >> identical. >> >> Cheers >> Dave >> >> >>> >>> Best regards >>> >>> Timothee NICOLAS >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at anl.gov Tue Aug 18 11:45:18 2015 From: abhyshr at anl.gov (Abhyankar, Shrirang G.) Date: Tue, 18 Aug 2015 16:45:18 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> Message-ID: Domenico, Here are the results for a power flow application.I don?t remember the size of the system. Package + Ordering MatSolve Sym. Fact Num. Fact Ordering KSPSolve Numeric ratio Linear solve ratio PETSc + QMD 1.60E-02 2.40E-02 9.99E-02 0.13 2.76E-01 1.14 1.90 PETSc + ND 3.20E-02 5.60E-02 9.40E-01 0.02 1.06E+00 10.68 7.31 PETSc + AMD 2.40E-02 2.00E-02 8.80E-02 0.01 1.45E-01 1.00 1.00 KLU + AMD 2.80E-02 2.80E-02 2.40E-01 0.01 3.08E-01 2.73 2.12 KLU + COLAMD 5.60E-02 4.00E-02 3.90E-01 0.01 5.00E-01 4.43 3.45 KLU + QMD 2.80E-02 1.20E-02 2.67E-01 0.13 4.40E-01 3.03 3.03 The numeric and linear solve ratios are the ratios w.r.t. to using PETSc + AMD. You can test the performance of KLU on the power flow example application $PETSC_DIR/src/snes/examples/tutorial/network/pflow/pf.c Shri -----Original Message----- From: Domenico Lahaye - EWI Date: Tuesday, August 18, 2015 at 6:34 AM To: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Dear all, > > Have the disappointing results of KLU been reported somewhere? >Earlier claims made might reinforce claims that we want to make. > > Sincere thanks, Domenico. > >________________________________________ >From: Romain Thomas >Sent: Tuesday, August 18, 2015 1:10 PM >To: Domenico Lahaye - EWI >Subject: FW: [petsc-users] petsc KLU > >Hi, >You can find below the message from Shri. >Best regards, >Romain > >-----Original Message----- >From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] >Sent: maandag 17 augustus 2015 18:21 >To: Romain Thomas; Zhang, Hong >Cc: petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > >Romain, > I added the KLU interface to PETSc last year hearing the hype about >KLU?s performance from several power system folks. I must say that I?m >terribly disappointed! I did some performance testing of KLU on power >grid problems (power flow application) last year and I got a similar >performance that you report (PETSc is 2-4 times faster than KLU). I also >clocked the time spent in PETSc?s SuiteSparse interface for KLU for >operations other than factorization and it was very minimal. The fastest >linear solver combination that I found was PETSc?s LU solver + AMD >ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). >Don?t try MUMPS and SuperLU ? they are terribly slow. 
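The combination recommended here (PETSc's own LU with the SuiteSparse AMD ordering) can be selected in code as well as with -pc_factor_mat_ordering_type amd; a sketch for an existing KSP follows, assuming PETSc was configured with SuiteSparse so that the "amd" ordering is available.

--------------------------------------------------------------------------------------------

#include <petscksp.h>

/* Direct solve with PETSc's LU and the AMD ordering, as suggested above. */
static PetscErrorCode UsePetscLUWithAMD(KSP ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatOrderingType(pc,"amd");CHKERRQ(ierr);  /* -pc_factor_mat_ordering_type amd */
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------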
> >Shri > > >From: hong zhang >Date: Monday, August 17, 2015 at 10:08 AM >To: Romain Thomas >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] petsc KLU > > >>Romain: >>Do you mean small sparse sequential 200 by 200 matrices? >>Petsc LU might be better than external LU packages because it >>implements simple LU algorithm and we took good care on data accesing >>(I've heard same observations). You may try 'qmd' matrix ordering for >>power grid simulation. >>I do not have experience on SuiteSparse. Testing MUMPS is worth it as >>well. >> >>Hong >> >> >>Hi >>Thank you for your answer. I was asking help because I find LU >>factorization 2-3 times faster than KLU. According to my problem size >>(200*200) and type (power system simulation), I should get almost the >>same computation time. Is it true to think that? Is the difference of >>time due to the interface between PETSc and SuiteSparse? >>Thank you, >>Romain >> >>-----Original Message----- >>From: Barry Smith [mailto:bsmith at mcs.anl.gov] >>Sent: vrijdag 14 augustus 2015 17:31 >>To: Romain Thomas >>Cc: Matthew Knepley; petsc-users at mcs.anl.gov >>Subject: Re: [petsc-users] petsc KLU >> >> >> You should call >> >> MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); >> >> then call >> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> This routines correctly internally call the appropriate >>MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >>above. >> There is no reason to (and it won't work) to call >> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >>directly. >> >> Barry >> >>> On Aug 14, 2015, at 10:07 AM, Romain Thomas >>>wrote: >>> >>> Hi, >>> Thank you for your answer. >>> My problem is a bit more complex. During the simulation (?real >>>time?), I need to upgrade at each time step the matrix A and the >>>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>>avoid these functions, I don?t use ksp or pc. I prefer to use >> the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >>And so, I want to know if there is similar functions for KLU. (I tried >>for Cholesky and, iLU and it works well). >>> Best regards, >>> Romain >>> >>> >>> From: Matthew Knepley [mailto:knepley at gmail.com] >>> Sent: vrijdag 14 augustus 2015 16:41 >>> To: Romain Thomas >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] petsc KLU >>> >>> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>>wrote: >>> Dear PETSc users, >>> >>> I would like to know if I can replace the following functions >>> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >>> >>> by >>> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >>> >>> in my code for the simulation of electrical power systems? (I >>> installed the package SuiteSparse) >>> >>> Why would you do that? It already works with the former code. 
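Putting the MatGetFactor()/MatLUFactorSymbolic()/MatLUFactorNumeric() recipe quoted above into one place, a sketch of the KLU path looks like this (error handling trimmed to the essentials; A, b and x are assumed to exist, and the "amd" ordering requires the SuiteSparse install):

--------------------------------------------------------------------------------------------

#include <petscmat.h>

/* Factor A once with KLU, then solve; redo MatLUFactorNumeric() whenever the
   numerical values of A change but its nonzero pattern does not. */
static PetscErrorCode SolveWithKLU(Mat A,Vec b,Vec x)
{
  Mat            F;
  IS             rowperm,colperm;
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetFactor(A,MATSOLVERKLU,MAT_FACTOR_LU,&F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A,"amd",&rowperm,&colperm);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F,A,rowperm,colperm,&info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F,A,&info);CHKERRQ(ierr);
  ierr = MatSolve(F,b,x);CHKERRQ(ierr);
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------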
In >>> fact, you should really just use >>> >>> KSPCreate(comm, &ksp) >>> KSPSetOperator(ksp, A, A); >>> KSPSetFromOptions(ksp); >>> KSPSolve(ksp, b, x); >>> >>> and then give the options >>> >>> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >>> >>> This is no advantage to using the Factor language since subsequent >>> calls to >>> KSPSolve() will not refactor. >>> >>> Matt >>> >>> Thank you, >>> Best regards, >>> Romain >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> > From D.J.P.Lahaye at tudelft.nl Tue Aug 18 12:38:04 2015 From: D.J.P.Lahaye at tudelft.nl (Domenico Lahaye - EWI) Date: Tue, 18 Aug 2015 17:38:04 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net>, Message-ID: <71B4204D92F7884494460446CAD0F04B5B7CDCA4@SRV364.tudelft.net> Dear Shri, I am by no means putting your arguments in doubt. I apologize if I gave that impression. I am however looking for a reference that we can cite when making claims on the performance of KLU. Did you publish your results somewhere? Thank you. Domenico. ________________________________________ From: Abhyankar, Shrirang G. [abhyshr at anl.gov] Sent: Tuesday, August 18, 2015 6:45 PM To: Domenico Lahaye - EWI; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU Domenico, Here are the results for a power flow application.I don?t remember the size of the system. Package + Ordering MatSolve Sym. Fact Num. Fact Ordering KSPSolve Numeric ratio Linear solve ratio PETSc + QMD 1.60E-02 2.40E-02 9.99E-02 0.13 2.76E-01 1.14 1.90 PETSc + ND 3.20E-02 5.60E-02 9.40E-01 0.02 1.06E+00 10.68 7.31 PETSc + AMD 2.40E-02 2.00E-02 8.80E-02 0.01 1.45E-01 1.00 1.00 KLU + AMD 2.80E-02 2.80E-02 2.40E-01 0.01 3.08E-01 2.73 2.12 KLU + COLAMD 5.60E-02 4.00E-02 3.90E-01 0.01 5.00E-01 4.43 3.45 KLU + QMD 2.80E-02 1.20E-02 2.67E-01 0.13 4.40E-01 3.03 3.03 The numeric and linear solve ratios are the ratios w.r.t. to using PETSc + AMD. You can test the performance of KLU on the power flow example application $PETSC_DIR/src/snes/examples/tutorial/network/pflow/pf.c Shri -----Original Message----- From: Domenico Lahaye - EWI Date: Tuesday, August 18, 2015 at 6:34 AM To: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Dear all, > > Have the disappointing results of KLU been reported somewhere? >Earlier claims made might reinforce claims that we want to make. > > Sincere thanks, Domenico. > >________________________________________ >From: Romain Thomas >Sent: Tuesday, August 18, 2015 1:10 PM >To: Domenico Lahaye - EWI >Subject: FW: [petsc-users] petsc KLU > >Hi, >You can find below the message from Shri. >Best regards, >Romain > >-----Original Message----- >From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] >Sent: maandag 17 augustus 2015 18:21 >To: Romain Thomas; Zhang, Hong >Cc: petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > >Romain, > I added the KLU interface to PETSc last year hearing the hype about >KLU?s performance from several power system folks. I must say that I?m >terribly disappointed! 
I did some performance testing of KLU on power >grid problems (power flow application) last year and I got a similar >performance that you report (PETSc is 2-4 times faster than KLU). I also >clocked the time spent in PETSc?s SuiteSparse interface for KLU for >operations other than factorization and it was very minimal. The fastest >linear solver combination that I found was PETSc?s LU solver + AMD >ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). >Don?t try MUMPS and SuperLU ? they are terribly slow. > >Shri > > >From: hong zhang >Date: Monday, August 17, 2015 at 10:08 AM >To: Romain Thomas >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] petsc KLU > > >>Romain: >>Do you mean small sparse sequential 200 by 200 matrices? >>Petsc LU might be better than external LU packages because it >>implements simple LU algorithm and we took good care on data accesing >>(I've heard same observations). You may try 'qmd' matrix ordering for >>power grid simulation. >>I do not have experience on SuiteSparse. Testing MUMPS is worth it as >>well. >> >>Hong >> >> >>Hi >>Thank you for your answer. I was asking help because I find LU >>factorization 2-3 times faster than KLU. According to my problem size >>(200*200) and type (power system simulation), I should get almost the >>same computation time. Is it true to think that? Is the difference of >>time due to the interface between PETSc and SuiteSparse? >>Thank you, >>Romain >> >>-----Original Message----- >>From: Barry Smith [mailto:bsmith at mcs.anl.gov] >>Sent: vrijdag 14 augustus 2015 17:31 >>To: Romain Thomas >>Cc: Matthew Knepley; petsc-users at mcs.anl.gov >>Subject: Re: [petsc-users] petsc KLU >> >> >> You should call >> >> MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); >> >> then call >> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> This routines correctly internally call the appropriate >>MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >>above. >> There is no reason to (and it won't work) to call >> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >>directly. >> >> Barry >> >>> On Aug 14, 2015, at 10:07 AM, Romain Thomas >>>wrote: >>> >>> Hi, >>> Thank you for your answer. >>> My problem is a bit more complex. During the simulation (?real >>>time?), I need to upgrade at each time step the matrix A and the >>>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>>avoid these functions, I don?t use ksp or pc. I prefer to use >> the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >>And so, I want to know if there is similar functions for KLU. (I tried >>for Cholesky and, iLU and it works well). 
>>> Best regards, >>> Romain >>> >>> >>> From: Matthew Knepley [mailto:knepley at gmail.com] >>> Sent: vrijdag 14 augustus 2015 16:41 >>> To: Romain Thomas >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] petsc KLU >>> >>> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>>wrote: >>> Dear PETSc users, >>> >>> I would like to know if I can replace the following functions >>> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >>> >>> by >>> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >>> >>> in my code for the simulation of electrical power systems? (I >>> installed the package SuiteSparse) >>> >>> Why would you do that? It already works with the former code. In >>> fact, you should really just use >>> >>> KSPCreate(comm, &ksp) >>> KSPSetOperator(ksp, A, A); >>> KSPSetFromOptions(ksp); >>> KSPSolve(ksp, b, x); >>> >>> and then give the options >>> >>> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >>> >>> This is no advantage to using the Factor language since subsequent >>> calls to >>> KSPSolve() will not refactor. >>> >>> Matt >>> >>> Thank you, >>> Best regards, >>> Romain >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> > From bsmith at mcs.anl.gov Tue Aug 18 13:20:51 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 13:20:51 -0500 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: > On Aug 18, 2015, at 4:20 AM, Timoth?e Nicolas wrote: > > Dave, > > Thx a lot for your very clear answer. My last question about modules could be reformulated like this : > > Why would I put anything in a ctx while I could simply use modules ? You could. The disadvantage of modules, which may not matter in your case, is that there can only ever be ONE data structure that contains all this information and can be used in your form functions. If you use a derived data type you can have multiple different ones of these data structures in the same code. Say for example, you had two different sets of physical parameters and you wanted to solve in your program both problems, you would just a derived type designed to hold this information and put each set of parameters into a different derived type object. With a module there could only be one set of parameters so you would have to do some horrible thing like constantly be changing the values in the model back and forth between the two sets. > Maybe it has something to do with the fact that PETSc is initially written for C ? Modules are kind of like singletons in object oriented programming. They are occasionally useful but limiting yourself to ONLY having singletons in object oriented programs is nuts. Barry > > Best > > Timothee > > 2015-08-18 17:59 GMT+09:00 Dave May : > > > On 18 August 2015 at 10:42, Timoth?e Nicolas wrote: > Hi all, > > I am in the process of writing an implicit solver for a set of PDEs (namely MHD equations), in FORTRAN. 
When setting the non-linear function to solve via Newton-Krylov, I use a "user defined context", namely the thing denoted by "ctx" on the doc page about SNESFunction : > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction > > In practice ctx is a user defined type which contains everything I need in the local routine which sets the function on the local part of the grid, FormFunctionLocal. That is, some local/global geometrical information on the grid, the physical parameter, and possibly any other thing. > > In my case it so happens that due to the scheme I have chosen, when I compute my function, I need the full solution of the problem at the last two time steps (which are in Vec format). So my ctx contains two Vec elements. Since I will work in 3D and intend to use a lot of points in the future, I am concerned about memory problems which could occur. > > In the grand scheme of things, the two vectors in your context aren't likely to significantly add to the total memory footprint of your code. A couple of things to note: > * If you run in parallel, only the local part of the vector will be stored on each MPI process. > * All the KSP methods will allocate auxiliary vectors. Most methods require more than 2 auxiliary vectors. > * SNES also requires auxiliary vectors. If you use JFNK, that method will also need some additional temporary vectors. > * If you assemble a Jacobian, this matrix will likely require much more memory per MPI process than two vectors > > > > Is there a limit to the size occupied by ctx ? > > The only limit is defined by the available memory per MPI process you have on your target machine. > > Would this be better if instead I was declaring global variables in a module and using this module inside FormFunctionLocal ? Is this allowed ? > > What would be the difference in doing that - the memory usage will be identical. > > Cheers > Dave > > > Best regards > > Timothee NICOLAS > > From reza.yaghmaie2 at gmail.com Tue Aug 18 19:25:42 2015 From: reza.yaghmaie2 at gmail.com (Reza Yaghmaie) Date: Tue, 18 Aug 2015 20:25:42 -0400 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: Thank you very much for the insight. It helped. I am trying to solve the system using *snes *routines. Let's say the I execute the below command in Fortran call *SNESSolve*(snes,PETSC_NULL_OBJECT,xvec,ierr) In the residual calculation and Jacobian update routines I need to finalize the vectors and matrix assemblies using the commands as following otherwise *SNESSolve* will crash: call *VecAssemblyBegin *(FVEC, ierr) call *VecAssemblyEnd *(FVEC, ierr) call *MatAssemblyBegin*(jac_prec,MAT_FINAL_ASSEMBLY,ierr) call *MatAssemblyEnd*(jac_prec,MAT_FINAL_ASSEMBLY,ierr) I face the issue that my debugger crashes at the locations of theses final vector and matrix assemblies. It worked for the sequential version of the code but for the parallel version it stops there. I am sure all processors in the mpi framework reach to these pointers simultaneously. Any insights? Thanks, Reza On Mon, Aug 17, 2015 at 2:25 PM, Barry Smith wrote: > > Reza, > > See src/snes/examples/tutorials/ex5f90.F for how this may be easily > done using a Fortran user defined type > > Barry > > > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie < > reza.yaghmaie2 at gmail.com> wrote: > > > > Hi, > > > > I have problems with passing variables through SNESSetFunction in my > code. 
basically I have the following subroutines in the main body of the > Fortran code. Could you provide some insight on how to transfer variables > into the residual calculation routine (FormFunction1)? > > > > Extra arguments to your FormFunction are meant to be passed in a > context, through the context variable. > > > > This is difficult in Fortran, but you can use a PetscObject as a > container. You can attach other > > PetscObjects using PetscObjectCompose() in Fortran. > > > > Matt > > > > Thanks, > > Reza > > > ------------------------------------------------------------------------------------------------------------------ > > main code > > > > SNES snes > > Vec xvec,rvec > > external FormFunction1 > > real*8 > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > call SNESSetFunction(snes,rvec,FormFunction1, > > & PETSC_NULL_OBJECT, > > & variable1,variable2,variable3,variable4, > > & ierr) > > > > end > > > > subroutine FormFunction1(snes,XVEC,FVEC, > > & dummy, > > & varable1,varable2,varable3,varable4, > > & ierr) > > > > SNES snes > > Vec XVEC,FVEC > > PetscFortranAddr dummy > > real*8 > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > return > > end > > > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 18 20:29:17 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 20:29:17 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: We would have to have full details of what happens in the debugger during those crashes. Barry > On Aug 18, 2015, at 7:25 PM, Reza Yaghmaie wrote: > > > Thank you very much for the insight. It helped. > > I am trying to solve the system using snes routines. Let's say the I execute the below command in Fortran > > call SNESSolve(snes,PETSC_NULL_OBJECT,xvec,ierr) > > In the residual calculation and Jacobian update routines I need to finalize the vectors and matrix assemblies using the commands as following otherwise SNESSolve will crash: > > call VecAssemblyBegin (FVEC, ierr) > call VecAssemblyEnd (FVEC, ierr) > > call MatAssemblyBegin(jac_prec,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(jac_prec,MAT_FINAL_ASSEMBLY,ierr) > > I face the issue that my debugger crashes at the locations of theses final vector and matrix assemblies. It worked for the sequential version of the code but for the parallel version it stops there. I am sure all processors in the mpi framework reach to these pointers simultaneously. Any insights? > > Thanks, > Reza > > > > On Mon, Aug 17, 2015 at 2:25 PM, Barry Smith wrote: > > Reza, > > See src/snes/examples/tutorials/ex5f90.F for how this may be easily done using a Fortran user defined type > > Barry > > > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > > > Hi, > > > > I have problems with passing variables through SNESSetFunction in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (FormFunction1)? 
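For reference, the context-passing pattern described in the replies above looks like this in C; the struct fields and names are invented for illustration (in Fortran the same role is played by a user-defined derived type, as in the ex5f90 example Barry points to).

  #include <petscsnes.h>

  typedef struct {            /* holds whatever the residual evaluation needs */
    PetscReal param;
    Vec       u_prev;
  } AppCtx;

  PetscErrorCode FormFunction1(SNES snes, Vec X, Vec F, void *ptr)
  {
    AppCtx *user = (AppCtx*)ptr;   /* all extra data arrives through this one pointer */
    /* ... evaluate F(X) using user->param and user->u_prev ... */
    return 0;
  }

  /* in main():
       AppCtx user;
       SNESSetFunction(snes, r, FormFunction1, &user);
     Two different parameter sets can simply live in two AppCtx instances,
     which is one advantage of a context over global (module) storage. */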
> > > > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. > > > > This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other > > PetscObjects using PetscObjectCompose() in Fortran. > > > > Matt > > > > Thanks, > > Reza > > ------------------------------------------------------------------------------------------------------------------ > > main code > > > > SNES snes > > Vec xvec,rvec > > external FormFunction1 > > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > call SNESSetFunction(snes,rvec,FormFunction1, > > & PETSC_NULL_OBJECT, > > & variable1,variable2,variable3,variable4, > > & ierr) > > > > end > > > > subroutine FormFunction1(snes,XVEC,FVEC, > > & dummy, > > & varable1,varable2,varable3,varable4, > > & ierr) > > > > SNES snes > > Vec XVEC,FVEC > > PetscFortranAddr dummy > > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > return > > end > > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > From zonexo at gmail.com Tue Aug 18 20:38:03 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 09:38:03 +0800 Subject: [petsc-users] difference between local and global vectors Message-ID: <55D3DDFB.30302@gmail.com> Hi, I am using DA. For e.g. DM da_u call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMCreateGlobalVector(da_u,u_global,ierr) call DMCreateLocalVector(da_u,u_local,ierr) To update the ghost values, I use: call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) It seems that I don't need to use global vector at all. So what's the difference between local and global vector? When will I need to use?: call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) -- Thank you Yours sincerely, TAY wee-beng From dave.mayhem23 at gmail.com Wed Aug 19 00:17:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 07:17:58 +0200 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D3DDFB.30302@gmail.com> References: <55D3DDFB.30302@gmail.com> Message-ID: On 19 August 2015 at 03:38, TAY wee-beng wrote: > Hi, > > I am using DA. For e.g. > > DM da_u > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > This is incorrect. The manpage for DMLocalToLocal clearly says "Maps from a local vector (including ghost points that contain irrelevant values) to another local vector where the ghost points in the second are set correctly." 
To update ghost values from a global vector (e.g. to perform the scatter) you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector? > * Local vectors contain ghost values from any neighbouring MPI processes. They are always defined over PETSC_COMM_SELF. * Global vectors store the DOFs assigned to each sub-domain. These will parallel vectors defined over the same communicator as your DM Thus, you use local vectors to compute things like the sub-domain contribution to (i) a non-linear residual evaluation or (ii) a sparse-matric vector product. You use global vectors together with linear and non-linear solvers as these vectors. If your stencil width was zero (in your DMDACreate3d() function call), then the would be no ghost values to communicate between neighbouring MPI processes. Hence, the entries in the following two arrays LA_u_local[], LA_u[] would be identical VecGetArrayRead(u_local,&LA_u_local); and VecGetArrayRead(u,&LA_u); That said, u_local would still be of type VECSEQ, where as u would be of type VECMPI. > > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > See points (i) and (ii) above from common use cases. Thanks, Dave > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:20:01 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:20:01 +0800 Subject: [petsc-users] difference between local and global vectors In-Reply-To: References: <55D3DDFB.30302@gmail.com> Message-ID: <55D43C31.4070902@gmail.com> On 19/8/2015 1:17 PM, Dave May wrote: > > > On 19 August 2015 at 03:38, TAY wee-beng > wrote: > > Hi, > > I am using DA. For e.g. > > DM da_u > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > > > This is incorrect. > The manpage for DMLocalToLocal clearly says "Maps from a local vector > (including ghost points that contain irrelevant values) to another > local vector where the ghost points in the second are set correctly." > To update ghost values from a global vector (e.g. to perform the > scatter) you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). Hi Dave, Thanks for the clarification although I'm still confused. Supposed I have a 1D vector da_u, It has size 8, so it's like da_u_array(8), with stencil width 1 So for 2 procs, there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) After performing some operations on each procs's da_u_array, I need to update 1st procs's da_u_array(5) and 2nd procs's da_u_array(4) from the 2nd and 1st procs respectively. I simply call: call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) and it seems to be enough. I check the ghost values and they have been updated. 
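For the case where the current data sits in a global vector (for instance coming back from a solver), the scatter Dave describes looks roughly like this in C; the names are illustrative and error checking is omitted.

  Vec u_loc;

  DMGetLocalVector(da_u, &u_loc);
  DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_loc);
  DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_loc);
  /* u_loc now holds the owned entries plus up-to-date ghost values,
     ready for a residual evaluation or stencil computation */
  DMRestoreLocalVector(da_u, &u_loc);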
So if I am not using the linear solvers, I do not need the global vector,is that so? > > > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector? > > > > * Local vectors contain ghost values from any neighbouring MPI > processes. They are always defined over PETSC_COMM_SELF. > * Global vectors store the DOFs assigned to each sub-domain. These > will parallel vectors defined over the same communicator as your DM > > Thus, you use local vectors to compute things like the sub-domain > contribution to (i) a non-linear residual evaluation or (ii) a > sparse-matric vector product. > You use global vectors together with linear and non-linear solvers as > these vectors. > > If your stencil width was zero (in your DMDACreate3d() function call), > then the would be no ghost values to communicate between neighbouring > MPI processes. Hence, the entries in the following two arrays > LA_u_local[], LA_u[] would be identical > VecGetArrayRead(u_local,&LA_u_local); > and > VecGetArrayRead(u,&LA_u); > > That said, u_local would still be of type VECSEQ, where as u would be > of type VECMPI. > > > > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > > > See points (i) and (ii) above from common use cases. > > Thanks, > Dave > > > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 03:26:13 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 10:26:13 +0200 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D43C31.4070902@gmail.com> References: <55D3DDFB.30302@gmail.com> <55D43C31.4070902@gmail.com> Message-ID: On 19 August 2015 at 10:20, TAY wee-beng wrote: > > On 19/8/2015 1:17 PM, Dave May wrote: > > > > On 19 August 2015 at 03:38, TAY wee-beng wrote: > >> Hi, >> >> I am using DA. For e.g. >> >> DM da_u >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMCreateGlobalVector(da_u,u_global,ierr) >> >> call DMCreateLocalVector(da_u,u_local,ierr) >> >> To update the ghost values, I use: >> >> call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) >> > > > This is incorrect. > The manpage for DMLocalToLocal clearly says "Maps from a local vector > (including ghost points that contain irrelevant values) to another local > vector where the ghost points in the second are set correctly." > To update ghost values from a global vector (e.g. to perform the scatter) > you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). > > I must apologize (and should have read my own email :D) - I misunderstood what DMLocalToLocalBegin/End does. Indeed it will give produce the correct / updated ghost values. > Hi Dave, > > Thanks for the clarification although I'm still confused. 
Supposed I have > a 1D vector da_u, It has size 8, so it's like da_u_array(8), with stencil > width 1 > > So for 2 procs, > > there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) > > After performing some operations on each procs's da_u_array, I need to > update 1st procs's da_u_array(5) and 2nd procs's da_u_array(4) from the 2nd > and 1st procs respectively. I simply call: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > and it seems to be enough. I check the ghost values and they have been > updated. > Yeah, this is correct. Sorry about my mistake in the previous email regarding what DMLocalToLocal actually does. > So if I am not using the linear solvers, I do not need the global > vector,is that so? > I guess in the end it is application specific whether you need a global vector or not. I would have thought you always would want a global vector. What is your application where you don't require a global vector? Cheers, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:28:25 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:28:25 +0800 Subject: [petsc-users] difference between local and global vectors In-Reply-To: References: <55D3DDFB.30302@gmail.com> <55D43C31.4070902@gmail.com> Message-ID: <55D43E29.9020400@gmail.com> On 19/8/2015 4:26 PM, Dave May wrote: > > > On 19 August 2015 at 10:20, TAY wee-beng > wrote: > > > On 19/8/2015 1:17 PM, Dave May wrote: >> >> >> On 19 August 2015 at 03:38, TAY wee-beng > > wrote: >> >> Hi, >> >> I am using DA. For e.g. >> >> DM da_u >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMCreateGlobalVector(da_u,u_global,ierr) >> >> call DMCreateLocalVector(da_u,u_local,ierr) >> >> To update the ghost values, I use: >> >> call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> >> >> This is incorrect. >> The manpage for DMLocalToLocal clearly says "Maps from a local >> vector (including ghost points that contain irrelevant values) to >> another local vector where the ghost points in the second are set >> correctly." >> To update ghost values from a global vector (e.g. to perform the >> scatter) you need to use DMGlobalToLocalBegin() , >> DMGlobalToLocalEnd(). > > > I must apologize (and should have read my own email :D) > - I misunderstood what DMLocalToLocalBegin/End does. > Indeed it will give produce the correct / updated ghost values. > > Hi Dave, > > Thanks for the clarification although I'm still confused. Supposed > I have a 1D vector da_u, It has size 8, so it's like > da_u_array(8), with stencil width 1 > > So for 2 procs, > > there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) > > After performing some operations on each procs's da_u_array, I > need to update 1st procs's da_u_array(5) and 2nd procs's > da_u_array(4) from the 2nd and 1st procs respectively. I simply call: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > and it seems to be enough. I check the ghost values and they have > been updated. > > > Yeah, this is correct. 
> Sorry about my mistake in the previous email regarding what > DMLocalToLocal actually does. > > So if I am not using the linear solvers, I do not need the global > vector,is that so? > > > I guess in the end it is application specific whether you need a > global vector or not. > I would have thought you always would want a global vector. > > What is your application where you don't require a global vector? Well, I mean when I don't need to solve the linear eqn. But of course, later on in the code, when I need to, I will require the global vector. Thanks > > Cheers, > Dave > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:54:29 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:54:29 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> Message-ID: <55D44445.8000503@gmail.com> On 21/7/2015 7:28 PM, Matthew Knepley wrote: > On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng > wrote: > > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > > No. Those routines work only for Vec objects. You could > > a) Declare a DMDA of the same size > > b) Use DMDAVecGetArrayF90() to get out the multidimensional array > > c) Use that in your code > > d) Use VecView() on the original vector Hi, Supposed I need to check the contents of the u_array which was declared using: PetscScalar,pointer :: u_array(:,:,:) call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) call VecView(array,viewer,ierr) call PetscViewerDestroy(viewer,ierr) Is this the correct way? > > Matt > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 03:58:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 10:58:58 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55D44445.8000503@gmail.com> References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> Message-ID: On 19 August 2015 at 10:54, TAY wee-beng wrote: > > On 21/7/2015 7:28 PM, Matthew Knepley wrote: > > On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng wrote: > >> Hi, >> >> I need to check the contents of the array which was declared using: >> >> PetscScalar,pointer :: >> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >> >> I tried to use : >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >> >> call VecView(p_array,viewer,ierr) >> >> or >> >> call MatView(p_array,viewer,ierr) >> >> call PetscViewerDestroy(viewer,ierr) >> >> but I got segmentation error. 
So is there a PETSc routine I can use? > > > No. Those routines work only for Vec objects. You could > > a) Declare a DMDA of the same size > > b) Use DMDAVecGetArrayF90() to get out the multidimensional array > > c) Use that in your code > > d) Use VecView() on the original vector > > > Hi, > > Supposed I need to check the contents of the u_array which was declared > using: > > PetscScalar,pointer :: u_array(:,:,:) > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(array,viewer,ierr) > The first argument of VecView must be of type Vec (as Matt noted). It looks you are passing in an array of PetscScalar's. > > call PetscViewerDestroy(viewer,ierr) > > Is this the correct way? > > > Matt > > >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 04:08:16 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 17:08:16 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> Message-ID: <55D44780.3000500@gmail.com> On 19/8/2015 4:58 PM, Dave May wrote: > > > On 19 August 2015 at 10:54, TAY wee-beng > wrote: > > > On 21/7/2015 7:28 PM, Matthew Knepley wrote: >> On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng > > wrote: >> >> Hi, >> >> I need to check the contents of the array which was declared >> using: >> >> PetscScalar,pointer :: >> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >> >> I tried to use : >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >> >> call VecView(p_array,viewer,ierr) >> >> or >> >> call MatView(p_array,viewer,ierr) >> >> call PetscViewerDestroy(viewer,ierr) >> >> but I got segmentation error. So is there a PETSc routine I >> can use? >> >> >> No. Those routines work only for Vec objects. You could >> >> a) Declare a DMDA of the same size >> >> b) Use DMDAVecGetArrayF90() to get out the multidimensional array >> >> c) Use that in your code >> >> d) Use VecView() on the original vector > > Hi, > > Supposed I need to check the contents of the u_array which was > declared using: > > PetscScalar,pointer :: u_array(:,:,:) > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(array,viewer,ierr) > > > > The first argument of VecView must be of type Vec (as Matt noted). > It looks you are passing in an array of PetscScalar's. 
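A compact C sketch of the sequence Matt outlines above: work on the array, restore it, then view the Vec itself. The Fortran analogue uses DMDAVecGetArrayF90/DMDAVecRestoreArrayF90, and the viewer chosen here is only for illustration.

  PetscScalar ***u;

  DMDAVecGetArray(da_u, u_local, &u);
  /* ... inspect or modify u[k][j][i] ... */
  DMDAVecRestoreArray(da_u, u_local, &u);
  VecView(u_local, PETSC_VIEWER_STDOUT_SELF);  /* VecView takes the Vec, not the raw array */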
Oh so should it be: Vec u_global,u_local call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) call VecView(u_local,viewer,ierr) call PetscViewerDestroy(viewer,ierr) > > > call PetscViewerDestroy(viewer,ierr) > > Is this the correct way? >> >> Matt >> >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 04:56:35 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 11:56:35 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55D44780.3000500@gmail.com> References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> <55D44780.3000500@gmail.com> Message-ID: On 19 August 2015 at 11:08, TAY wee-beng wrote: > > On 19/8/2015 4:58 PM, Dave May wrote: > > > > On 19 August 2015 at 10:54, TAY wee-beng wrote: > >> >> On 21/7/2015 7:28 PM, Matthew Knepley wrote: >> >> On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng wrote: >> >>> Hi, >>> >>> I need to check the contents of the array which was declared using: >>> >>> PetscScalar,pointer :: >>> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >>> >>> I tried to use : >>> >>> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >>> >>> call VecView(p_array,viewer,ierr) >>> >>> or >>> >>> call MatView(p_array,viewer,ierr) >>> >>> call PetscViewerDestroy(viewer,ierr) >>> >>> but I got segmentation error. So is there a PETSc routine I can use? >> >> >> No. Those routines work only for Vec objects. You could >> >> a) Declare a DMDA of the same size >> >> b) Use DMDAVecGetArrayF90() to get out the multidimensional array >> >> c) Use that in your code >> >> d) Use VecView() on the original vector >> >> >> Hi, >> >> Supposed I need to check the contents of the u_array which was declared >> using: >> >> PetscScalar,pointer :: u_array(:,:,:) >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) >> >> call VecView(array,viewer,ierr) >> > > > The first argument of VecView must be of type Vec (as Matt noted). > It looks you are passing in an array of PetscScalar's. > > > Oh so should it be: > > Vec u_global,u_local > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(u_local,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > Yes, the arguments types now match. 
However, if you run this in parallel there will be two issues: (1) You have a different local vector per process, thus you will need to use a unique file name, e.g. "u-rankXXX.txt" to avoid overwriting the data from each process (2) You need to make sure that the communicator used for the viewer and the vector are the same. To implement (1) and (2) you could do something like this: MPI_Comm comm; PetscMPIInt rank; char filename[PETSC_MAX_PATH_LEN]; ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); ierr = PetscSNPrintf(filename,PETSC_MAX_PATH_LEN-1,"u-%d.txt",rank);CHKERRQ(ierr); ierr = PetscObjectGetComm(((PetscObject)u_local,&comm);CHKERRQ(ierr); ierr = PetscViewerASCIIOpen(comm,filename,viewer);CHKERRQ(ierr); > > > > >> >> call PetscViewerDestroy(viewer,ierr) >> >> Is this the correct way? >> >> >> Matt >> >> >>> >>> -- >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 04:57:55 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 11:57:55 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> <55D44780.3000500@gmail.com> Message-ID: > > MPI_Comm comm; > PetscMPIInt rank; > char filename[PETSC_MAX_PATH_LEN]; > > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > ierr = > PetscSNPrintf(filename,PETSC_MAX_PATH_LEN-1,"u-%d.txt",rank);CHKERRQ(ierr); > ierr = PetscObjectGetComm(((PetscObject)u_local,&comm);CHKERRQ(ierr); > ierr = PetscViewerASCIIOpen(comm,filename,viewer);CHKERRQ(ierr); > The last line should be ierr = PetscViewerASCIIOpen(comm,filename,*&viewer*);CHKERRQ(ierr); -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Aug 19 09:41:28 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 19 Aug 2015 09:41:28 -0500 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55ADE837.8080705@gmail.com> References: <55ADE837.8080705@gmail.com> Message-ID: check PetscScalarView() Satish On Tue, 21 Jul 2015, TAY wee-beng wrote: > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > From honglianglu87 at gmail.com Wed Aug 19 10:51:07 2015 From: honglianglu87 at gmail.com (Hongliang Lu) Date: Wed, 19 Aug 2015 23:51:07 +0800 Subject: [petsc-users] on the data size problem Message-ID: Dear all, I am trying to implement a BFS algorithm using Petsc, and I have tested my code on a graph of 5 nodes, but when I tested on a larger graph, which size is 5000 nodes, the program went wrong, and ca not finished, could some on help me out? thank you very much!!!!! I tried to run the following code in a cluster with 10 nodes. 
int main(int argc,char **args)
{
  Vec curNodes,tmp;
  Mat oriGraph;
  PetscInt rows, cols;
  PetscScalar one=1;
  PetscScalar nodeVecSum=1;
  char filein[PETSC_MAX_PATH_LEN],fileout[PETSC_MAX_PATH_LEN],buf[PETSC_MAX_PATH_LEN];
  PetscViewer fd;
  PetscInitialize(&argc,&args,(char *)0,help);
  PetscOptionsGetString(PETSC_NULL,"-fin",filein,PETSC_MAX_PATH_LEN-1,PETSC_NULL);
  PetscViewerBinaryOpen(PETSC_COMM_WORLD,filein,FILE_MODE_READ,&fd);
  MatCreate(PETSC_COMM_WORLD,&oriGraph);
  MatLoad(oriGraph,fd);
  MatGetSize(oriGraph,&rows,&cols);
  MatSetOption(oriGraph,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);
  MatSetUp(oriGraph);
  VecCreate(PETSC_COMM_WORLD,&curNodes);
  VecSetSizes(curNodes,PETSC_DECIDE,rows);
  VecSetFromOptions(curNodes);
  VecCreate(PETSC_COMM_WORLD,&tmp);
  VecSetSizes(tmp,PETSC_DECIDE,rows);
  VecSetFromOptions(tmp);
  VecZeroEntries(tmp);
  srand(time(0));
  PetscInt node=rand()%rows;
  PetscPrintf(PETSC_COMM_SELF,"The node ID is: %d \n",node);
  VecSetValues(curNodes,1,&node,&one,INSERT_VALUES);
  VecAssemblyBegin(curNodes);
  VecAssemblyEnd(curNodes);
  PetscViewerDestroy(&fd);
  const PetscInt *colsv;
  const PetscScalar *valsv;
  PetscInt ncols,i,zero=0;
  PetscInt iter=0;
  nodeVecSum=1;
  for(;iter<10;iter++)
  {
    VecAssemblyBegin(curNodes);
    VecAssemblyEnd(curNodes);
    MatMult(oriGraph,curNodes,tmp);
    VecAssemblyBegin(tmp);
    VecAssemblyEnd(tmp);
    VecSum(tmp,&nodeVecSum);
    PetscPrintf(PETSC_COMM_SELF,"There are neighbors: %d \n",(int)nodeVecSum);
    VecSum(curNodes,&nodeVecSum);
    if(nodeVecSum<1)
      break;
    PetscScalar y;
    PetscInt indices;
    PetscInt n,m,rstart,rend;
    IS isrow;
    Mat curMat;
    MatGetLocalSize(oriGraph,&n,&m);
    MatGetOwnershipRange(oriGraph,&rstart,&rend);
    ISCreateStride(PETSC_COMM_SELF,n,rstart,1,&isrow);
    MatGetSubMatrix(oriGraph,isrow,NULL,MAT_INITIAL_MATRIX,&curMat);
    MatGetSize(curMat,&n,&m);
    for(i=rstart;i0){
      MatGetRow(oriGraph,indices,&ncols,&colsv,&valsv);
      PetscScalar *v,zero=0;
      PetscMalloc1(cols,&v);
      for(int j=0;j
From bsmith at mcs.anl.gov Tue Aug 18 20:44:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 20:44:11 -0500 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D3DDFB.30302@gmail.com> References: <55D3DDFB.30302@gmail.com> Message-ID: The global vectors are what the "algebraic solvers" TS/SNES/KSP see, while the local vectors are what you use to perform function evaluations and Jacobian evaluations needed by KSP, SNES, and TS, for example with SNESSetFunction(). Barry > On Aug 18, 2015, at 8:38 PM, TAY wee-beng wrote: > > Hi, > > I am using DA. For e.g. > > DM da_u > > call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector?
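As a small illustrative sketch of the hand-off Barry describes (all names are placeholders): once a sub-domain computation in a ghosted local vector is finished, its owned part is copied back into the global vector that the algebraic solvers actually operate on.

  DMLocalToGlobalBegin(da_u, r_local, INSERT_VALUES, r_global);
  DMLocalToGlobalEnd(da_u, r_local, INSERT_VALUES, r_global);
  /* r_global is what gets handed to KSP/SNES/TS */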
> > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > From zonexo at gmail.com Wed Aug 19 11:06:59 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Thu, 20 Aug 2015 00:06:59 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> Message-ID: <1440000424021-ac869639-5f486918-9750e761@gmail.com> Hi, So I can use PetscScalar view directly to view the u_array? Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.7&pv=5.0.2] On Wed, Aug 19, 2015 at 10:41 PM, Satish Balay < balay at mcs.anl.gov [balay at mcs.anl.gov] > wrote: check PetscScalarView() Satish On Tue, 21 Jul 2015, TAY wee-beng wrote: > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eugenio.aulisa at ttu.edu Wed Aug 19 21:26:25 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 02:26:25 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: Hi, I am solving an iteration of GMRES -> PCMG -> PCASM where I build my particular ASM domain decomposition. In setting the PCMG I would like at each level to use the same pre- and post-smoother and for this reason I am using ... PCMGGetSmoother ( pcMG, level , &subksp ); to extract and set at each level the ksp object. In setting PCASM then I use ... KSPGetPC ( subksp, &subpc ); PCSetType ( subpc, PCASM ); ... and then set my own decomposition ... PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); ... Now everything compiles, and runs with no memory leakage, but I do not get the expected convergence. When I checked the output of -ksp_view, I saw something that puzzled me: at each level >0, while in the MG pre-smoother the ASM domain decomposition is the one that I set, for example with 4 processes I get >>>>>>>>>>>>>>>>>>> ... Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1 using preconditioner applied to right hand side for initial guess tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 52 [1] number of local blocks = 48 [2] number of local blocks = 48 [3] number of local blocks = 50 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - ... >>>>>>>>>>> in the post-smoother I have the default ASM decomposition with overlapping 1: >>>>>>>>>>> ... 
Up solver (post-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=2 tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (level-2sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning ... >>>>>>>>>>>>> %%%%%%%%%%%%%%%%%%%%%%%% So it seams that by using PCMGGetSmoother ( pcMG, level , &subksp ); I was capable to set both the pre- and post- smoothers to be PCASM but everything I did after that applied only to the pre-smoother, while the post-smoother got the default PCASM options. I know that I can use PCMGGetSmootherDown and PCMGGetSmootherUp, but that would probably double the memory allocation and the computational time in the ASM. Is there any way I can just use PCMGGetSmoother and use the same PCASM in the pre- and post- smoother? I hope I was clear enough. Thanks a lot for your help, Eugenio From zonexo at gmail.com Wed Aug 19 22:28:56 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 20 Aug 2015 11:28:56 +0800 Subject: [petsc-users] Debugging KSP output error Message-ID: <55D54978.8030003@gmail.com> Hi, I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. Using MatView and VecView, I found that my LHS matrix and RHS vec are the same for 1,2 and 3 procs. However, my pressure (ans) output is the almost the same (due to truncation err) for 1,2 procs. But for 3 procs, the output is the same as for the 1,2 procs for all values except: 1. the last few values for procs 0 2. the first and last few values for procs 1 and 2. Shouldn't the output be the same when the LHS matrix and RHS vec are the same? How can I debug to find the err? -- Thank you Yours sincerely, TAY wee-beng From dave.mayhem23 at gmail.com Thu Aug 20 02:29:16 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 20 Aug 2015 09:29:16 +0200 Subject: [petsc-users] Debugging KSP output error In-Reply-To: <55D54978.8030003@gmail.com> References: <55D54978.8030003@gmail.com> Message-ID: On 20 August 2015 at 05:28, TAY wee-beng wrote: > Hi, > > I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. > > Using MatView and VecView, I found that my LHS matrix and RHS vec are the > same for 1,2 and 3 procs. > > However, my pressure (ans) output is the almost the same (due to > truncation err) for 1,2 procs. > > But for 3 procs, the output is the same as for the 1,2 procs for all > values except: > > 1. the last few values for procs 0 > > 2. the first and last few values for procs 1 and 2. > > Shouldn't the output be the same when the LHS matrix and RHS vec are the > same? How can I debug to find the err? > > It's a bit hard to say much without knowing exactly what solver configuration you actually ran and without seeing the difference in the solution you are referring too. Some preconditioners have different behaviour in serial and parallel. 
Thus, the convergence of the solver and the residual history (and thus the answer) can look slightly different. This difference will become smaller as you solve the system more accurately. Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 To avoid the problem mentioned above, try using -pc_type jacobi. This PC is the same in serial and parallel. Thus, if your A and b are identical on 1,2,3 procs, then the residuals and solution will also be identical on 1,2,3 procs (upto machine precision). Thanks, Dave > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 20 02:37:13 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 20 Aug 2015 02:37:13 -0500 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry > On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: > > Hi, > > I am solving an iteration of > > GMRES -> PCMG -> PCASM > > where I build my particular ASM domain decomposition. > > In setting the PCMG I would like at each level > to use the same pre- and post-smoother > and for this reason I am using > ... > PCMGGetSmoother ( pcMG, level , &subksp ); > > to extract and set at each level the ksp object. > > In setting PCASM then I use > ... > KSPGetPC ( subksp, &subpc ); > PCSetType ( subpc, PCASM ); > ... > and then set my own decomposition > ... > PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); > ... > > Now everything compiles, and runs with no memory leakage, > but I do not get the expected convergence. > > When I checked the output of -ksp_view, I saw something that puzzled me: > at each level >0, while in the MG pre-smoother the ASM domain decomposition > is the one that I set, for example with 4 processes I get > >>>>>>>>>>>>>>>>>>>> > ... > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1 > using preconditioner applied to right hand side for initial guess > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 52 > [1] number of local blocks = 48 > [2] number of local blocks = 48 > [3] number of local blocks = 50 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > ... >>>>>>>>>>>> > > > in the post-smoother I have the default ASM decomposition with overlapping 1: > > >>>>>>>>>>>> > ... 
> Up solver (post-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=2 > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (level-2sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > ... >>>>>>>>>>>>>> > %%%%%%%%%%%%%%%%%%%%%%%% > > So it seams that by using > > PCMGGetSmoother ( pcMG, level , &subksp ); > > I was capable to set both the pre- and post- smoothers to be PCASM > but everything I did after that applied only to the > pre-smoother, while the post-smoother got the default PCASM options. > > I know that I can use > PCMGGetSmootherDown and PCMGGetSmootherUp, but that would > probably double the memory allocation and the computational time in the ASM. > > Is there any way I can just use PCMGGetSmoother > and use the same PCASM in the pre- and post- smoother? > > I hope I was clear enough. > > Thanks a lot for your help, > Eugenio > > > From zonexo at gmail.com Thu Aug 20 04:01:33 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 20 Aug 2015 17:01:33 +0800 Subject: [petsc-users] Debugging KSP output error In-Reply-To: References: <55D54978.8030003@gmail.com> Message-ID: <55D5976D.8060509@gmail.com> On 20/8/2015 3:29 PM, Dave May wrote: > > > On 20 August 2015 at 05:28, TAY wee-beng > wrote: > > Hi, > > I run my code on 1, 2 and 3 procs. KSP is used to solve the > Poisson eqn. > > Using MatView and VecView, I found that my LHS matrix and RHS vec > are the same for 1,2 and 3 procs. > > However, my pressure (ans) output is the almost the same (due to > truncation err) for 1,2 procs. > > But for 3 procs, the output is the same as for the 1,2 procs for > all values except: > > 1. the last few values for procs 0 > > 2. the first and last few values for procs 1 and 2. > > Shouldn't the output be the same when the LHS matrix and RHS vec > are the same? How can I debug to find the err? > > > It's a bit hard to say much without knowing exactly what solver > configuration you actually ran and without seeing the difference in > the solution you are referring too. > > Some preconditioners have different behaviour in serial and parallel. > Thus, the convergence of the solver and the residual history (and thus > the answer) can look slightly different. This difference will become > smaller as you solve the system more accurately. > Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 > > To avoid the problem mentioned above, try using -pc_type jacobi. This > PC is the same in serial and parallel. Thus, if your A and b are > identical on 1,2,3 procs, then the residuals and solution will also > be identical on 1,2,3 procs (upto machine precision). > Hi Dave, I tried using jacobi and it's the same result. I found out that the error is actually due to mismatched size between DMDACreate3d and MatGetOwnershipRange. 
Using

call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,&
     size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_w,ierr)

call DMDAGetCorners(da_u,start_ijk(1),start_ijk(2),start_ijk(3),width_ijk(1),width_ijk(2),width_ijk(3),ierr)

and

call MatCreateAIJ(MPI_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,size_x*size_y*size_z,size_x*size_y*size_z,7,PETSC_NULL_INTEGER,7,PETSC_NULL_INTEGER,A_mat,ierr)

call MatGetOwnershipRange(A_mat,ijk_sta_p,ijk_end_p,ierr)

Is this possible? Or is there an error somewhere? It happens when using 3 procs, instead of 1 or 2. For my size_x,size_y,size_z = 4,8,10, it was partitioned along z direction with 1->4, 5->7, 8->10 using 3 procs with DMDACreate3d which should give ownership (with Fortran index + 1) of:

myid,ijk_sta_p,ijk_end_p 1 129 192
myid,ijk_sta_p,ijk_end_p 0 1 128
myid,ijk_sta_p,ijk_end_p 2 193 320

But with MatGetOwnershipRange, I got

myid,ijk_sta_p,ijk_end_p 1 108 214
myid,ijk_sta_p,ijk_end_p 0 1 107
myid,ijk_sta_p,ijk_end_p 2 215 320

> Thanks, > Dave > > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 20 04:13:55 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 20 Aug 2015 11:13:55 +0200 Subject: [petsc-users] Debugging KSP output error In-Reply-To: <55D5976D.8060509@gmail.com> References: <55D54978.8030003@gmail.com> <55D5976D.8060509@gmail.com> Message-ID: On 20 August 2015 at 11:01, TAY wee-beng wrote: > > On 20/8/2015 3:29 PM, Dave May wrote: > > > > On 20 August 2015 at 05:28, TAY wee-beng wrote: > >> Hi, >> >> I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. >> >> Using MatView and VecView, I found that my LHS matrix and RHS vec are the >> same for 1,2 and 3 procs. >> >> However, my pressure (ans) output is the almost the same (due to >> truncation err) for 1,2 procs. >> >> But for 3 procs, the output is the same as for the 1,2 procs for all >> values except: >> >> 1. the last few values for procs 0 >> >> 2. the first and last few values for procs 1 and 2. >> >> Shouldn't the output be the same when the LHS matrix and RHS vec are the >> same? How can I debug to find the err? >> >> > It's a bit hard to say much without knowing exactly what solver > configuration you actually ran and without seeing the difference in the > solution you are referring too. > > Some preconditioners have different behaviour in serial and parallel. > Thus, the convergence of the solver and the residual history (and thus the > answer) can look slightly different. This difference will become smaller as > you solve the system more accurately. > Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 > > To avoid the problem mentioned above, try using -pc_type jacobi. This PC > is the same in serial and parallel. Thus, if your A and b are identical on > 1,2,3 procs, then the residuals and solution will also be identical on > 1,2,3 procs (upto machine precision). > > Hi Dave, > > I tried using jacobi and it's the same result. I found out that the error > is actually due to mismatched size between DMDACreate3d and > MatGetOwnershipRange.
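(The continuation of this reply below points to the fix; as a one-line C sketch, reusing the names from the snippet above purely for illustration:)

  DMCreateMatrix(da_u, &A_mat);  /* the matrix then inherits the DMDA's parallel layout and preallocation */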
> > Using > > *call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,&* > > > *size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_w,ierr)* > > *call > DMDAGetCorners(da_u,start_ijk(1),start_ijk(2),start_ijk(3),width_ijk(1),width_ijk(2),width_ijk(3),ierr)* > > and > > *call > MatCreateAIJ(MPI_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,size_x*size_y*size_z,size_x*size_y*size_z,7,PETSC_NULL_INTEGER,7,PETSC_NULL_INTEGER,A_mat,ierr)* > > *call MatGetOwnershipRange(A_mat,ijk_sta_p,ijk_end_p,ierr)* > > Is this possible? Or is there an error somewhere? It happens when using 3 > procs, instead of 1 or 2. > > Sure it is possible you get a mismatch in the local sizes if you create the matrix this way as the matrix created knows nothing about the DMDA, and specifically, it does not know how it has been spatially decomposed. If you want to ensure consistency between the DMDA and the matrix, you should always use DMCreateMatrix() to create the matrix. Any subsequent calls to MatGetOwnershipRange() will then be consistent with the DMDA parallel layout. > For my size_x,size_y,size_z = 4,8,10, it was partitioned along z direction > with 1->4, 5->7, 8->10 using 3 procs with DMDACreate3d which should give > ownership (with Fortran index + 1) of: > > myid,ijk_sta_p,ijk_end_p 1 129 192 > myid,ijk_sta_p,ijk_end_p 0 1 128 > myid,ijk_sta_p,ijk_end_p 2 193 320 > > But with MatGetOwnershipRange, I got > > myid,ijk_sta_p,ijk_end_p 1 108 214 > myid,ijk_sta_p,ijk_end_p 0 1 107 > myid,ijk_sta_p,ijk_end_p 2 215 320 > > Thanks, > Dave > > > >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzoalessiobotti at gmail.com Thu Aug 20 05:29:10 2015 From: lorenzoalessiobotti at gmail.com (Lorenzo Alessio Botti) Date: Thu, 20 Aug 2015 12:29:10 +0200 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. smoothers.resize(nLevels+1); smoothers_up.resize(nLevels); for (PetscInt i = 0; i < nLevels; i++) { PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); } PCMGSetNumberSmoothDown(M_pc,1); PCMGSetNumberSmoothUp(M_pc,1); ? set coarse solver options here for (PetscInt i = 0; i < nLevels; i++) { PC pc; KSPSetType(smoothers[i], KSPGMRES); KSPGetPC(smoothers[i], &pc); KSPSetPCSide(smoothers[i], PC_RIGHT); PCSetType(pc, PCASM); PCFactorSetPivotInBlocks(pc, PETSC_TRUE); PCFactorSetAllowDiagonalFill(pc); PCFactorSetReuseFill(pc, PETSC_TRUE); PCFactorSetReuseOrdering(pc, PETSC_TRUE); KSPSetType(smoothers_up[i], KSPGMRES); KSPSetPC(smoothers_up[i], pc); KSPSetPCSide(smoothers_up[i], PC_RIGHT); KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); KSPSetNormType(smoothers[i],KSP_NORM_NONE); KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); } Is this correct? Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. 
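For comparison, much of this per-level smoother setup can also be driven from the command line, in which case the options apply to the level smoothers PCMG creates. A sketch, assuming the default option prefixes; the executable name and level count are placeholders, and custom subdomains such as those set with PCASMSetLocalSubdomains() still have to be installed in code.

  ./myapp -ksp_type gmres -pc_type mg -pc_mg_levels 3 \
          -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 1 \
          -mg_levels_pc_type asm -mg_levels_pc_asm_overlap 0 \
          -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu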
Bests Lorenzo > Message: 4 > Date: Thu, 20 Aug 2015 02:37:13 -0500 > From: Barry Smith > > To: "Aulisa, Eugenio" > > Cc: "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov > > Content-Type: text/plain; charset="us-ascii" > > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... 
>> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Thu Aug 20 06:30:56 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Thu, 20 Aug 2015 12:30:56 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: Message-ID: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Hello. I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. Cheers, Nelson Em 2015-07-24 16:50, Barry Smith escreveu: > It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 > ... processes with the option -log_summary and send (as attachments) > the log summary information. > > Also on the same machine run the streams benchmark; with recent > releases of PETSc you only need to do > > cd $PETSC_DIR > make streams NPMAX=16 (or whatever your largest process count is) > > and send the output. > > I suspect that you are doing everything fine and it is more an issue > with the configuration of your machine. Also read the information at > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on > "binding" > > Barry > >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >> wrote: >> >> Hello, >> >> I have been using PETSc for a few months now, and it truly is >> fantastic piece of software. >> >> In my particular example I am working with a large, sparse >> distributed (MPI AIJ) matrix we can refer as 'G'. 
>> G is a horizontal - retangular matrix (for example, 1,1 Million rows >> per 2,1 Million columns). This matrix is commonly very sparse and not >> diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the >> diagonal block of MPI AIJ representation). >> To work with this matrix, I also have a few parallel vectors >> (created using MatCreate Vec), we can refer as 'm' and 'k'. >> I am trying to parallelize an iterative algorithm in which the most >> computational heavy operations are: >> >> ->Matrix-Vector Multiplication, more precisely G * m + k = b >> (MatMultAdd). From what I have been reading, to achive a good speedup >> in this operation, G should be as much diagonal as possible, due to >> overlapping communication and computation. But even when using a G >> matrix in which the diagonal block has ~95% of the nnz, I cannot get a >> decent speedup. Most of the times, the performance even gets worse. >> >> ->Matrix-Matrix Multiplication, in this case I need to perform G * >> G' = A, where A is later used on the linear solver and G' is transpose >> of G. The speedup in this operation is not worse, although is not very >> good. >> >> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >> from the last two operations. I tried to apply a RCM permutation to A >> to make it more diagonal, for better performance. However, the problem >> I faced was that, the permutation is performed locally in each >> processor and thus, the final result is different with different >> number of processors. I assume this was intended to reduce >> communication. The solution I found was >> 1-calculate A >> 2-calculate, localy to 1 machine, the RCM permutation IS using A >> 3-apply this permutation to the lines of G. >> This works well, and A is generated as if RCM permuted. It is fine >> to do this operation in one machine because it is only done once while >> reading the input. The nnz of G become more spread and less diagonal, >> causing problems when calculating G * m + k = b. >> >> These 3 operations (except the permutation) are performed in each >> iteration of my algorithm. >> >> So, my questions are. >> -What are the characteristics of G that lead to a good speedup in >> the operations I described? Am I missing something and too much >> obsessed with the diagonal block? >> >> -Is there a better way to permute A without permute G and still get >> the same result using 1 or N machines? >> >> >> I have been avoiding asking for help for a while. I'm very sorry for >> the long email. >> Thank you very much for your time. >> Best Regards, >> Nelson -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: streams.output URL: From eugenio.aulisa at ttu.edu Thu Aug 20 06:51:17 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 11:51:17 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> References: , <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> Message-ID: Hi Barry, Thanks for your answer. I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, at least not voluntarily. For the source it is available on https://github.com/NumPDEClassTTU/femus but it is part of a much larger library and I do not think any of you want to install and run it just to find what I messed up. 
In any case, if you just want to look at the source code where I set up the level smoother it is in https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp line 400 void AsmPetscLinearEquationSolver::MGsetLevels ( LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ Be aware, that even if it seams that this takes care of the coarse level it is not. The coarse level smoother is set some where else. Thanks, Eugenio ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Thursday, August 20, 2015 2:37 AM To: Aulisa, Eugenio Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry > On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: > > Hi, > > I am solving an iteration of > > GMRES -> PCMG -> PCASM > > where I build my particular ASM domain decomposition. > > In setting the PCMG I would like at each level > to use the same pre- and post-smoother > and for this reason I am using > ... > PCMGGetSmoother ( pcMG, level , &subksp ); > > to extract and set at each level the ksp object. > > In setting PCASM then I use > ... > KSPGetPC ( subksp, &subpc ); > PCSetType ( subpc, PCASM ); > ... > and then set my own decomposition > ... > PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); > ... > > Now everything compiles, and runs with no memory leakage, > but I do not get the expected convergence. > > When I checked the output of -ksp_view, I saw something that puzzled me: > at each level >0, while in the MG pre-smoother the ASM domain decomposition > is the one that I set, for example with 4 processes I get > >>>>>>>>>>>>>>>>>>>> > ... > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1 > using preconditioner applied to right hand side for initial guess > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 52 > [1] number of local blocks = 48 > [2] number of local blocks = 48 > [3] number of local blocks = 50 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > ... >>>>>>>>>>>> > > > in the post-smoother I have the default ASM decomposition with overlapping 1: > > >>>>>>>>>>>> > ... 
> Up solver (post-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=2 > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (level-2sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > ... >>>>>>>>>>>>>> > %%%%%%%%%%%%%%%%%%%%%%%% > > So it seams that by using > > PCMGGetSmoother ( pcMG, level , &subksp ); > > I was capable to set both the pre- and post- smoothers to be PCASM > but everything I did after that applied only to the > pre-smoother, while the post-smoother got the default PCASM options. > > I know that I can use > PCMGGetSmootherDown and PCMGGetSmootherUp, but that would > probably double the memory allocation and the computational time in the ASM. > > Is there any way I can just use PCMGGetSmoother > and use the same PCASM in the pre- and post- smoother? > > I hope I was clear enough. > > Thanks a lot for your help, > Eugenio > > > From eugenio.aulisa at ttu.edu Thu Aug 20 07:00:16 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 12:00:16 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> References: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> Message-ID: Thanks Lorenzo If I well understood what you say is Set up PCASM for the smoother_down[i] and then feed it up to smoother_up[i] Does this assure that only one memory allocation is done for PCASM? Eugenio ________________________________ From: Lorenzo Alessio Botti [lorenzoalessiobotti at gmail.com] Sent: Thursday, August 20, 2015 5:29 AM To: petsc-users at mcs.anl.gov Cc: Aulisa, Eugenio Subject: GMRES -> PCMG -> PCASM pre- post- smoother I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. smoothers.resize(nLevels+1); smoothers_up.resize(nLevels); for (PetscInt i = 0; i < nLevels; i++) { PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); } PCMGSetNumberSmoothDown(M_pc,1); PCMGSetNumberSmoothUp(M_pc,1); ? 
set coarse solver options here for (PetscInt i = 0; i < nLevels; i++) { PC pc; KSPSetType(smoothers[i], KSPGMRES); KSPGetPC(smoothers[i], &pc); KSPSetPCSide(smoothers[i], PC_RIGHT); PCSetType(pc, PCASM); PCFactorSetPivotInBlocks(pc, PETSC_TRUE); PCFactorSetAllowDiagonalFill(pc); PCFactorSetReuseFill(pc, PETSC_TRUE); PCFactorSetReuseOrdering(pc, PETSC_TRUE); KSPSetType(smoothers_up[i], KSPGMRES); KSPSetPC(smoothers_up[i], pc); KSPSetPCSide(smoothers_up[i], PC_RIGHT); KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); KSPSetNormType(smoothers[i],KSP_NORM_NONE); KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); } Is this correct? Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. Bests Lorenzo Message: 4 Date: Thu, 20 Aug 2015 02:37:13 -0500 From: Barry Smith > To: "Aulisa, Eugenio" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov> Content-Type: text/plain; charset="us-ascii" What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: Hi, I am solving an iteration of GMRES -> PCMG -> PCASM where I build my particular ASM domain decomposition. In setting the PCMG I would like at each level to use the same pre- and post-smoother and for this reason I am using ... PCMGGetSmoother ( pcMG, level , &subksp ); to extract and set at each level the ksp object. In setting PCASM then I use ... KSPGetPC ( subksp, &subpc ); PCSetType ( subpc, PCASM ); ... and then set my own decomposition ... PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); ... Now everything compiles, and runs with no memory leakage, but I do not get the expected convergence. When I checked the output of -ksp_view, I saw something that puzzled me: at each level >0, while in the MG pre-smoother the ASM domain decomposition is the one that I set, for example with 4 processes I get ... Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1 using preconditioner applied to right hand side for initial guess tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 52 [1] number of local blocks = 48 [2] number of local blocks = 48 [3] number of local blocks = 50 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - ... in the post-smoother I have the default ASM decomposition with overlapping 1: ... 
Up solver (post-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=2 tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (level-2sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning ... %%%%%%%%%%%%%%%%%%%%%%%% So it seams that by using PCMGGetSmoother ( pcMG, level , &subksp ); I was capable to set both the pre- and post- smoothers to be PCASM but everything I did after that applied only to the pre-smoother, while the post-smoother got the default PCASM options. I know that I can use PCMGGetSmootherDown and PCMGGetSmootherUp, but that would probably double the memory allocation and the computational time in the ASM. Is there any way I can just use PCMGGetSmoother and use the same PCASM in the pre- and post- smoother? I hope I was clear enough. Thanks a lot for your help, Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzoalessiobotti at gmail.com Thu Aug 20 07:57:38 2015 From: lorenzoalessiobotti at gmail.com (Lorenzo Alessio Botti) Date: Thu, 20 Aug 2015 14:57:38 +0200 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> Message-ID: <170951A3-97AA-4639-9E80-3E235C4A5244@gmail.com> I guess that only one memory allocation is done, but basically this is what I?m asking as well. Is this correct? I?d be more confident if one of the developers confirmed this. Lorenzo > On 20 Aug 2015, at 14:00, Aulisa, Eugenio wrote: > > Thanks Lorenzo > > If I well understood what you say is > > Set up PCASM for the smoother_down[i] and then > feed it up to smoother_up[i] > > Does this assure that only one > memory allocation is done for PCASM? > > > Eugenio > From: Lorenzo Alessio Botti [lorenzoalessiobotti at gmail.com] > Sent: Thursday, August 20, 2015 5:29 AM > To: petsc-users at mcs.anl.gov > Cc: Aulisa, Eugenio > Subject: GMRES -> PCMG -> PCASM pre- post- smoother > > I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. > > > smoothers.resize(nLevels+1); > smoothers_up.resize(nLevels); > for (PetscInt i = 0; i < nLevels; i++) > { > PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); > KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle > PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); > } > PCMGSetNumberSmoothDown(M_pc,1); > PCMGSetNumberSmoothUp(M_pc,1); > > ? 
set coarse solver options here > > for (PetscInt i = 0; i < nLevels; i++) > { > PC pc; > KSPSetType(smoothers[i], KSPGMRES); > KSPGetPC(smoothers[i], &pc); > KSPSetPCSide(smoothers[i], PC_RIGHT); > PCSetType(pc, PCASM); > PCFactorSetPivotInBlocks(pc, PETSC_TRUE); > PCFactorSetAllowDiagonalFill(pc); > PCFactorSetReuseFill(pc, PETSC_TRUE); > PCFactorSetReuseOrdering(pc, PETSC_TRUE); > KSPSetType(smoothers_up[i], KSPGMRES); > KSPSetPC(smoothers_up[i], pc); > KSPSetPCSide(smoothers_up[i], PC_RIGHT); > KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); > KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); > KSPSetNormType(smoothers[i],KSP_NORM_NONE); > KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); > } > > Is this correct? > Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. > > Bests > Lorenzo > >> Message: 4 >> Date: Thu, 20 Aug 2015 02:37:13 -0500 >> From: Barry Smith > >> To: "Aulisa, Eugenio" > >> Cc: "petsc-users at mcs.anl.gov " > >> Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother >> Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov > >> Content-Type: text/plain; charset="us-ascii" >> >> >> What you describe is not the expected behavior. I expected exactly the result that you expected. >> >> Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? >> >> Barry >> >>> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: >>> >>> Hi, >>> >>> I am solving an iteration of >>> >>> GMRES -> PCMG -> PCASM >>> >>> where I build my particular ASM domain decomposition. >>> >>> In setting the PCMG I would like at each level >>> to use the same pre- and post-smoother >>> and for this reason I am using >>> ... >>> PCMGGetSmoother ( pcMG, level , &subksp ); >>> >>> to extract and set at each level the ksp object. >>> >>> In setting PCASM then I use >>> ... >>> KSPGetPC ( subksp, &subpc ); >>> PCSetType ( subpc, PCASM ); >>> ... >>> and then set my own decomposition >>> ... >>> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >>> ... >>> >>> Now everything compiles, and runs with no memory leakage, >>> but I do not get the expected convergence. >>> >>> When I checked the output of -ksp_view, I saw something that puzzled me: >>> at each level >0, while in the MG pre-smoother the ASM domain decomposition >>> is the one that I set, for example with 4 processes I get >>> >>>>>>>>>>>>>>>>>>>>>> >>> ... 
>>> Down solver (pre-smoother) on level 2 ------------------------------- >>> KSP Object: (level-2) 4 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=1 >>> using preconditioner applied to right hand side for initial guess >>> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >>> left preconditioning >>> using nonzero initial guess >>> using NONE norm type for convergence test >>> PC Object: (level-2) 4 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 52 >>> [1] number of local blocks = 48 >>> [2] number of local blocks = 48 >>> [3] number of local blocks = 50 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> ... >>>>>>>>>>>>>> >>> >>> >>> in the post-smoother I have the default ASM decomposition with overlapping 1: >>> >>> >>>>>>>>>>>>>> >>> ... >>> Up solver (post-smoother) on level 2 ------------------------------- >>> KSP Object: (level-2) 4 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=2 >>> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >>> left preconditioning >>> using nonzero initial guess >>> using NONE norm type for convergence test >>> PC Object: (level-2) 4 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object: (level-2sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> ... >>>>>>>>>>>>>>>> >>> %%%%%%%%%%%%%%%%%%%%%%%% >>> >>> So it seams that by using >>> >>> PCMGGetSmoother ( pcMG, level , &subksp ); >>> >>> I was capable to set both the pre- and post- smoothers to be PCASM >>> but everything I did after that applied only to the >>> pre-smoother, while the post-smoother got the default PCASM options. >>> >>> I know that I can use >>> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >>> probably double the memory allocation and the computational time in the ASM. >>> >>> Is there any way I can just use PCMGGetSmoother >>> and use the same PCASM in the pre- and post- smoother? >>> >>> I hope I was clear enough. >>> >>> Thanks a lot for your help, >>> Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 20 10:17:26 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Aug 2015 10:17:26 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Message-ID: On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva < nelsonflsilva at ist.utl.pt> wrote: > Hello. > > I am sorry for the long time without response. I decided to rewrite my > application in a different way and will send the log_summary output when > done reimplementing. 
> > As for the machine, I am using mpirun to run jobs in a 8 node cluster. I > modified the makefile on the steams folder so it would run using my > hostfile. > The output is attached to this email. It seems reasonable for a cluster > with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. > 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. Thanks, Matt > Cheers, > Nelson > > > Em 2015-07-24 16:50, Barry Smith escreveu: > >> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >> ... processes with the option -log_summary and send (as attachments) >> the log summary information. >> >> Also on the same machine run the streams benchmark; with recent >> releases of PETSc you only need to do >> >> cd $PETSC_DIR >> make streams NPMAX=16 (or whatever your largest process count is) >> >> and send the output. >> >> I suspect that you are doing everything fine and it is more an issue >> with the configuration of your machine. Also read the information at >> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >> "binding" >> >> Barry >> >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva < >>> nelsonflsilva at ist.utl.pt> wrote: >>> >>> Hello, >>> >>> I have been using PETSc for a few months now, and it truly is fantastic >>> piece of software. >>> >>> In my particular example I am working with a large, sparse distributed >>> (MPI AIJ) matrix we can refer as 'G'. >>> G is a horizontal - retangular matrix (for example, 1,1 Million rows per >>> 2,1 Million columns). This matrix is commonly very sparse and not diagonal >>> 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal >>> block of MPI AIJ representation). >>> To work with this matrix, I also have a few parallel vectors (created >>> using MatCreate Vec), we can refer as 'm' and 'k'. >>> I am trying to parallelize an iterative algorithm in which the most >>> computational heavy operations are: >>> >>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>> (MatMultAdd). From what I have been reading, to achive a good speedup in >>> this operation, G should be as much diagonal as possible, due to >>> overlapping communication and computation. But even when using a G matrix >>> in which the diagonal block has ~95% of the nnz, I cannot get a decent >>> speedup. Most of the times, the performance even gets worse. >>> >>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = >>> A, where A is later used on the linear solver and G' is transpose of G. The >>> speedup in this operation is not worse, although is not very good. >>> >>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >>> from the last two operations. I tried to apply a RCM permutation to A to >>> make it more diagonal, for better performance. However, the problem I faced >>> was that, the permutation is performed locally in each processor and thus, >>> the final result is different with different number of processors. I assume >>> this was intended to reduce communication. The solution I found was >>> 1-calculate A >>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>> 3-apply this permutation to the lines of G. >>> This works well, and A is generated as if RCM permuted. It is fine to do >>> this operation in one machine because it is only done once while reading >>> the input. 
The nnz of G become more spread and less diagonal, causing >>> problems when calculating G * m + k = b. >>> >>> These 3 operations (except the permutation) are performed in each >>> iteration of my algorithm. >>> >>> So, my questions are. >>> -What are the characteristics of G that lead to a good speedup in the >>> operations I described? Am I missing something and too much obsessed with >>> the diagonal block? >>> >>> -Is there a better way to permute A without permute G and still get the >>> same result using 1 or N machines? >>> >>> >>> I have been avoiding asking for help for a while. I'm very sorry for the >>> long email. >>> Thank you very much for your time. >>> Best Regards, >>> Nelson >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 20 23:54:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 20 Aug 2015 23:54:58 -0500 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> Message-ID: <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> Ahhh, void AsmPetscLinearEquationSolver::MGsolve ( const bool ksp_clean , const unsigned &npre, const unsigned &npost ) { if ( ksp_clean ) { PetscMatrix* KKp = static_cast< PetscMatrix* > ( _KK ); Mat KK = KKp->mat(); KSPSetOperators ( _ksp, KK, _Pmat ); KSPSetTolerances ( _ksp, _rtol, _abstol, _dtol, _maxits ); KSPSetFromOptions ( _ksp ); PC pcMG; KSPGetPC(_ksp, &pcMG); PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); } PetscErrorCode PCMGSetNumberSmoothDown(PC pc,PetscInt n) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; PetscInt i,levels; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); if (!mglevels) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONGSTATE,"Must set MG levels before calling"); PetscValidLogicalCollectiveInt(pc,n,2); levels = mglevels[0]->levels; for (i=1; ismoothd,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,n);CHKERRQ(ierr); mg->default_smoothd = n; } PetscFunctionReturn(0); } PetscErrorCode PCMGGetSmootherUp(PC pc,PetscInt l,KSP *ksp) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; const char *prefix; MPI_Comm comm; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); /* This is called only if user wants a different pre-smoother from post. Thus we check if a different one has already been allocated, if not we allocate it. 
*/ if (!l) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_OUTOFRANGE,"There is no such thing as a up smoother on the coarse grid"); if (mglevels[l]->smoothu == mglevels[l]->smoothd) { KSPType ksptype; PCType pctype; PC ipc; PetscReal rtol,abstol,dtol; PetscInt maxits; KSPNormType normtype; ierr = PetscObjectGetComm((PetscObject)mglevels[l]->smoothd,&comm);CHKERRQ(ierr); ierr = KSPGetOptionsPrefix(mglevels[l]->smoothd,&prefix);CHKERRQ(ierr); ierr = KSPGetTolerances(mglevels[l]->smoothd,&rtol,&abstol,&dtol,&maxits);CHKERRQ(ierr); ierr = KSPGetType(mglevels[l]->smoothd,&ksptype);CHKERRQ(ierr); ierr = KSPGetNormType(mglevels[l]->smoothd,&normtype);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothd,&ipc);CHKERRQ(ierr); ierr = PCGetType(ipc,&pctype);CHKERRQ(ierr); ierr = KSPCreate(comm,&mglevels[l]->smoothu);CHKERRQ(ierr); ierr = KSPSetErrorIfNotConverged(mglevels[l]->smoothu,pc->erroriffailure);CHKERRQ(ierr); ierr = PetscObjectIncrementTabLevel((PetscObject)mglevels[l]->smoothu,(PetscObject)pc,mglevels[0]->levels-l);CHKERRQ(ierr); ierr = KSPSetOptionsPrefix(mglevels[l]->smoothu,prefix);CHKERRQ(ierr); ierr = KSPSetTolerances(mglevels[l]->smoothu,rtol,abstol,dtol,maxits);CHKERRQ(ierr); ierr = KSPSetType(mglevels[l]->smoothu,ksptype);CHKERRQ(ierr); ierr = KSPSetNormType(mglevels[l]->smoothu,normtype);CHKERRQ(ierr); ierr = KSPSetConvergenceTest(mglevels[l]->smoothu,KSPConvergedSkip,NULL,NULL);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothu,&ipc);CHKERRQ(ierr); ierr = PCSetType(ipc,pctype);CHKERRQ(ierr); ierr = PetscLogObjectParent((PetscObject)pc,(PetscObject)mglevels[l]->smoothu);CHKERRQ(ierr); } if (ksp) *ksp = mglevels[l]->smoothu; PetscFunctionReturn(0); } As soon as you set both the up and down number of iterations it causes a duplication of the current smoother with some options preserved but others not (we don't have a KSPDuplicate() that duplicates everything). So if you are fine with the number of pre and post smooths the same just don't set both PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); if you want them to be different you can share the same PC between the two (which has the overlapping matrices in it) but you cannot share the same KSP. I can tell you how to do that but suggest it is simpler just to have the same number of pre and post smooths Barry > On Aug 20, 2015, at 6:51 AM, Aulisa, Eugenio wrote: > > Hi Barry, > > Thanks for your answer. > > I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, > at least not voluntarily. > > For the source it is available on > https://github.com/NumPDEClassTTU/femus > but it is part of a much larger library and > I do not think any of you want to install and run it > just to find what I messed up. > > In any case, if you just want to look at the source code > where I set up the level smoother it is in > > https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp > > line 400 > > void AsmPetscLinearEquationSolver::MGsetLevels ( > LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, > const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ > > Be aware, that even if it seams that this takes care of the coarse level it is not. > The coarse level smoother is set some where else. 
> > Thanks, > Eugenio > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Thursday, August 20, 2015 2:37 AM > To: Aulisa, Eugenio > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... 
>> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> >> > From eugenio.aulisa at ttu.edu Fri Aug 21 19:38:27 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Sat, 22 Aug 2015 00:38:27 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> References: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> , <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> Message-ID: Thanks Barry. Yes that was my problem now if I run with the same down and up number of iterations I see the in -ksp_view output that smoother up is the same as smoother down. I think I figure it out how to set up different smoothers up and down but use the same ASM Preconditioner, which is more or less what Lorenzo suggested. 
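A rough C sketch of that arrangement, for one level: distinct down/up KSPs sharing a single PCASM, so the overlapping subdomain matrices are only built once. Here pcMG, level, n_sub, is_ovl, is_loc, npre and npost stand in for the application's own objects, and the call sequence is an illustration of Barry's and Lorenzo's suggestions rather than code from either of them.

    KSP            down, up;
    PC             asmpc;
    PetscErrorCode ierr;

    ierr = PCMGGetSmootherDown(pcMG, level, &down);CHKERRQ(ierr);
    ierr = PCMGGetSmootherUp(pcMG, level, &up);CHKERRQ(ierr);    /* allocates a separate up KSP */
    ierr = KSPGetPC(down, &asmpc);CHKERRQ(ierr);
    ierr = PCSetType(asmpc, PCASM);CHKERRQ(ierr);
    ierr = PCASMSetLocalSubdomains(asmpc, n_sub, is_ovl, is_loc);CHKERRQ(ierr);
    ierr = KSPSetPC(up, asmpc);CHKERRQ(ierr);                    /* up smoother reuses the same PC */

    /* different pre/post sweep counts, following the pattern PCMGSetNumberSmooth* uses */
    ierr = KSPSetTolerances(down, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, npre);CHKERRQ(ierr);
    ierr = KSPSetTolerances(up,   PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, npost);CHKERRQ(ierr);

Running with -ksp_view is the quickest way to confirm that both smoothers end up reporting the intended ASM decomposition.
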
Thanks again Eugenio ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Thursday, August 20, 2015 11:54 PM To: Aulisa, Eugenio Cc: PETSc list Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Ahhh, void AsmPetscLinearEquationSolver::MGsolve ( const bool ksp_clean , const unsigned &npre, const unsigned &npost ) { if ( ksp_clean ) { PetscMatrix* KKp = static_cast< PetscMatrix* > ( _KK ); Mat KK = KKp->mat(); KSPSetOperators ( _ksp, KK, _Pmat ); KSPSetTolerances ( _ksp, _rtol, _abstol, _dtol, _maxits ); KSPSetFromOptions ( _ksp ); PC pcMG; KSPGetPC(_ksp, &pcMG); PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); } PetscErrorCode PCMGSetNumberSmoothDown(PC pc,PetscInt n) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; PetscInt i,levels; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); if (!mglevels) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONGSTATE,"Must set MG levels before calling"); PetscValidLogicalCollectiveInt(pc,n,2); levels = mglevels[0]->levels; for (i=1; ismoothd,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,n);CHKERRQ(ierr); mg->default_smoothd = n; } PetscFunctionReturn(0); } PetscErrorCode PCMGGetSmootherUp(PC pc,PetscInt l,KSP *ksp) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; const char *prefix; MPI_Comm comm; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); /* This is called only if user wants a different pre-smoother from post. Thus we check if a different one has already been allocated, if not we allocate it. */ if (!l) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_OUTOFRANGE,"There is no such thing as a up smoother on the coarse grid"); if (mglevels[l]->smoothu == mglevels[l]->smoothd) { KSPType ksptype; PCType pctype; PC ipc; PetscReal rtol,abstol,dtol; PetscInt maxits; KSPNormType normtype; ierr = PetscObjectGetComm((PetscObject)mglevels[l]->smoothd,&comm);CHKERRQ(ierr); ierr = KSPGetOptionsPrefix(mglevels[l]->smoothd,&prefix);CHKERRQ(ierr); ierr = KSPGetTolerances(mglevels[l]->smoothd,&rtol,&abstol,&dtol,&maxits);CHKERRQ(ierr); ierr = KSPGetType(mglevels[l]->smoothd,&ksptype);CHKERRQ(ierr); ierr = KSPGetNormType(mglevels[l]->smoothd,&normtype);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothd,&ipc);CHKERRQ(ierr); ierr = PCGetType(ipc,&pctype);CHKERRQ(ierr); ierr = KSPCreate(comm,&mglevels[l]->smoothu);CHKERRQ(ierr); ierr = KSPSetErrorIfNotConverged(mglevels[l]->smoothu,pc->erroriffailure);CHKERRQ(ierr); ierr = PetscObjectIncrementTabLevel((PetscObject)mglevels[l]->smoothu,(PetscObject)pc,mglevels[0]->levels-l);CHKERRQ(ierr); ierr = KSPSetOptionsPrefix(mglevels[l]->smoothu,prefix);CHKERRQ(ierr); ierr = KSPSetTolerances(mglevels[l]->smoothu,rtol,abstol,dtol,maxits);CHKERRQ(ierr); ierr = KSPSetType(mglevels[l]->smoothu,ksptype);CHKERRQ(ierr); ierr = KSPSetNormType(mglevels[l]->smoothu,normtype);CHKERRQ(ierr); ierr = KSPSetConvergenceTest(mglevels[l]->smoothu,KSPConvergedSkip,NULL,NULL);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothu,&ipc);CHKERRQ(ierr); ierr = PCSetType(ipc,pctype);CHKERRQ(ierr); ierr = PetscLogObjectParent((PetscObject)pc,(PetscObject)mglevels[l]->smoothu);CHKERRQ(ierr); } if (ksp) *ksp = mglevels[l]->smoothu; PetscFunctionReturn(0); } As soon as you set both the up and down number of iterations it causes a duplication of the current smoother with some options preserved but others not (we don't have a KSPDuplicate() that duplicates 
everything). So if you are fine with the number of pre and post smooths the same just don't set both PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); if you want them to be different you can share the same PC between the two (which has the overlapping matrices in it) but you cannot share the same KSP. I can tell you how to do that but suggest it is simpler just to have the same number of pre and post smooths Barry > On Aug 20, 2015, at 6:51 AM, Aulisa, Eugenio wrote: > > Hi Barry, > > Thanks for your answer. > > I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, > at least not voluntarily. > > For the source it is available on > https://github.com/NumPDEClassTTU/femus > but it is part of a much larger library and > I do not think any of you want to install and run it > just to find what I messed up. > > In any case, if you just want to look at the source code > where I set up the level smoother it is in > > https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp > > line 400 > > void AsmPetscLinearEquationSolver::MGsetLevels ( > LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, > const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ > > Be aware, that even if it seams that this takes care of the coarse level it is not. > The coarse level smoother is set some where else. > > Thanks, > Eugenio > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Thursday, August 20, 2015 2:37 AM > To: Aulisa, Eugenio > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... 
>> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... >> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> >> > From david.knezevic at akselos.com Sat Aug 22 06:59:33 2015 From: david.knezevic at akselos.com (David Knezevic) Date: Sat, 22 Aug 2015 07:59:33 -0400 Subject: [petsc-users] Variatonal inequalities Message-ID: Hi all, I see from Section 5.7 of the manual that SNES supports box constraints on variables, which is great. However, I was also hoping to also be able to consider general linear inequality constraints, so I was wondering if anyone has any suggestions on how (or if) that could be done with PETSc? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... 
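For the box-constraint part of the question, the SNES VI solvers are the relevant machinery; a minimal C sketch is below, where snes and x are the already-configured solver and solution vector, and the bound values and vector names xl, xu are placeholders. The general linear inequality constraints are exactly the open part of the question and are not covered by this.

    Vec            xl, xu;
    PetscErrorCode ierr;

    ierr = VecDuplicate(x, &xl);CHKERRQ(ierr);
    ierr = VecDuplicate(x, &xu);CHKERRQ(ierr);
    ierr = VecSet(xl, 0.0);CHKERRQ(ierr);                       /* example lower bound     */
    ierr = VecSet(xu, PETSC_INFINITY);CHKERRQ(ierr);            /* no upper bound          */
    ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr);   /* reduced-space VI Newton */
    ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);
    ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);
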
URL: From simpson at math.drexel.edu Sat Aug 22 16:02:32 2015 From: simpson at math.drexel.edu (Gideon Simpson) Date: Sat, 22 Aug 2015 17:02:32 -0400 Subject: [petsc-users] two issues with sparse direct solvers Message-ID: <94C1CB4B-C560-4E12-A3B8-2341061548EC@math.drexel.edu> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. 2. When running with SuperLU dist, I got the following error, with no further information: MPI_ABORT was invoked on rank 36 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Sat Aug 22 16:04:18 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Sat, 22 Aug 2015 17:04:18 -0400 Subject: [petsc-users] issues with sparse direct solvers Message-ID: I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. 2. When running with SuperLU dist, I got the following error, with no further information: [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: likely location of problem given in stack below [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [3]PETSC ERROR: INSTEAD the line number of the start of the function [3]PETSC ERROR: is given. [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [3]PETSC ERROR: #1 User provided function() line 0 in unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [6]PETSC ERROR: ------------------------------------------------------------------------ [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [6]PETSC ERROR: likely location of problem given in stack below [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [6]PETSC ERROR: INSTEAD the line number of the start of the function [6]PETSC ERROR: is given. [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [6]PETSC ERROR: Signal received [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [6]PETSC ERROR: #1 User provided function() line 0 in unknown file [7]PETSC ERROR: ------------------------------------------------------------------------ [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [7]PETSC ERROR: likely location of problem given in stack below [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [7]PETSC ERROR: INSTEAD the line number of the start of the function [7]PETSC ERROR: is given. [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [7]PETSC ERROR: Signal received [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [7]PETSC ERROR: #1 User provided function() line 0 in unknown file [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: likely location of problem given in stack below [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [2]PETSC ERROR: INSTEAD the line number of the start of the function [2]PETSC ERROR: is given. [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [2]PETSC ERROR: #1 User provided function() line 0 in unknown file [4]PETSC ERROR: ------------------------------------------------------------------------ [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [4]PETSC ERROR: likely location of problem given in stack below [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [4]PETSC ERROR: INSTEAD the line number of the start of the function [4]PETSC ERROR: is given. [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [4]PETSC ERROR: Signal received [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [4]PETSC ERROR: #1 User provided function() line 0 in unknown file [5]PETSC ERROR: ------------------------------------------------------------------------ [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [5]PETSC ERROR: likely location of problem given in stack below [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [5]PETSC ERROR: INSTEAD the line number of the start of the function [5]PETSC ERROR: is given. [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [5]PETSC ERROR: Signal received [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [5]PETSC ERROR: #1 User provided function() line 0 in unknown file -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 22 16:12:53 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 16:12:53 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: Message-ID: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> > On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: > > I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: > > 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. 
There?s no error message, it just sits there and doesn?t do anything. You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. > > 2. When running with SuperLU dist, I got the following error, with no further information: The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes Barry > > [3]PETSC ERROR: ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [3]PETSC ERROR: likely location of problem given in stack below > [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [3]PETSC ERROR: INSTEAD the line number of the start of the function > [3]PETSC ERROR: is given. > [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [3]PETSC ERROR: Signal received > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort > [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > [6]PETSC ERROR: ------------------------------------------------------------------------ > [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [6]PETSC ERROR: likely location of problem given in stack below > [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [6]PETSC ERROR: INSTEAD the line number of the start of the function > [6]PETSC ERROR: is given. > [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [6]PETSC ERROR: Signal received > [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > [7]PETSC ERROR: ------------------------------------------------------------------------ > [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [7]PETSC ERROR: likely location of problem given in stack below > [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [7]PETSC ERROR: INSTEAD the line number of the start of the function > [7]PETSC ERROR: is given. 
> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [7]PETSC ERROR: Signal received > [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > [2]PETSC ERROR: ------------------------------------------------------------------------ > [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [2]PETSC ERROR: likely location of problem given in stack below > [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [2]PETSC ERROR: INSTEAD the line number of the start of the function > [2]PETSC ERROR: is given. > [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Signal received > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > [4]PETSC ERROR: ------------------------------------------------------------------------ > [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [4]PETSC ERROR: likely location of problem given in stack below > [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [4]PETSC ERROR: INSTEAD the line number of the start of the function > [4]PETSC ERROR: is given. > [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [4]PETSC ERROR: Signal received > [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015
> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes
> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file
> [5]PETSC ERROR: ------------------------------------------------------------------------
> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [5]PETSC ERROR: likely location of problem given in stack below
> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------
> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [5]PETSC ERROR: INSTEAD the line number of the start of the function
> [5]PETSC ERROR: is given.
> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c
> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c
> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h
> [5]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [5]PETSC ERROR: Signal received
> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015
> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes
> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file
>
> -gideon
>

From gideon.simpson at gmail.com Sat Aug 22 16:16:01 2015
From: gideon.simpson at gmail.com (Gideon Simpson)
Date: Sat, 22 Aug 2015 17:16:01 -0400
Subject: [petsc-users] issues with sparse direct solvers
In-Reply-To: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov>
References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov>
Message-ID: <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>

Thanks Barry, I'll take a look at debugging. I'm also going to try petsc 3.6, since that has a newer MUMPS build.

Regarding the SuperLU bugs, are they bad enough that I should distrust output even when errors were not generated?
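
In the meantime, one sanity check I can run regardless of which direct solver is underneath is to compute the residual of the solve myself. A rough sketch of what I have in mind (the names A, b, x are just placeholders for the operator, right-hand side, and solution in my code; error handling abbreviated):

    Vec            r;
    PetscReal      rnorm, bnorm;
    PetscErrorCode ierr;

    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);        /* r = A*x     */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);     /* r = b - A*x */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||b - A*x|| / ||b|| = %g\n", (double)(rnorm/bnorm));CHKERRQ(ierr);
    ierr = VecDestroy(&r);CHKERRQ(ierr);

If that relative residual comes back near machine precision, I would be more comfortable trusting a run that finished without errors.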
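
For the MUMPS hang, my plan is to reproduce it on a small process count and look at where each rank is stuck. Something along these lines (assuming gdb is available on the compute nodes; the process count and executable name are just from my own setup):

    # launch every rank under gdb in the current terminal
    mpirun -n 4 ./blowup_batch2 -start_in_debugger noxterm

    # or let it hang as usual, then attach to one of the stuck ranks
    gdb -p <pid of a hung blowup_batch2 process>

and then use 'bt' on a few ranks to see whether they are inside the MUMPS factorization or blocked in an MPI call.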
-gideon > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: >> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: >> >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> 2. When running with SuperLU dist, I got the following error, with no further information: > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > Barry > > >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: likely location of problem given in stack below >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> [3]PETSC ERROR: is given. >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [3]PETSC ERROR: Signal received >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. 
>> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. >> -------------------------------------------------------------------------- >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> [6]PETSC ERROR: ------------------------------------------------------------------------ >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [6]PETSC ERROR: likely location of problem given in stack below >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [6]PETSC ERROR: INSTEAD the line number of the start of the function >> [6]PETSC ERROR: is given. >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [6]PETSC ERROR: Signal received >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [7]PETSC ERROR: ------------------------------------------------------------------------ >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [7]PETSC ERROR: likely location of problem given in stack below >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [7]PETSC ERROR: INSTEAD the line number of the start of the function >> [7]PETSC ERROR: is given. 
>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [7]PETSC ERROR: Signal received >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Signal received >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [2]PETSC ERROR: ------------------------------------------------------------------------ >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [2]PETSC ERROR: likely location of problem given in stack below >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [2]PETSC ERROR: INSTEAD the line number of the start of the function >> [2]PETSC ERROR: is given. >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [2]PETSC ERROR: Signal received >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [4]PETSC ERROR: ------------------------------------------------------------------------ >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [4]PETSC ERROR: likely location of problem given in stack below >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [4]PETSC ERROR: INSTEAD the line number of the start of the function >> [4]PETSC ERROR: is given. >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [4]PETSC ERROR: Signal received >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [5]PETSC ERROR: ------------------------------------------------------------------------ >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [5]PETSC ERROR: likely location of problem given in stack below >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [5]PETSC ERROR: INSTEAD the line number of the start of the function >> [5]PETSC ERROR: is given. >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [5]PETSC ERROR: Signal received >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -gideon >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Sat Aug 22 16:17:21 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Sat, 22 Aug 2015 22:17:21 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Message-ID: <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> Hi. I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. 
In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program spends most of its time in "VecScatterEnd". In this example, the matrix taking part in the Matrix-Vector products is not very "diagonal heavy". The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, together with each execution time.

NMachines  %NNZ on diagonal block (per machine)                                                          ExecTime
1          machine0 100%                                                                                 16min08sec
2          machine0 91.1%; machine1 69.2%                                                                24min58sec
3          machine0 90.9%; machine1 82.8%; machine2 51.6%                                                25min42sec
4          machine0 91.9%; machine1 82.4%; machine2 73.1%; machine3 39.9%                                26min27sec
5          machine0 93.2%; machine1 82.8%; machine2 74.4%; machine3 64.6%; machine4 31.6%                39min23sec
6          machine0 94.2%; machine1 82.6%; machine2 73.1%; machine3 65.2%; machine4 55.9%; machine5 25.4%  54min54sec

In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes as PETSC_DECIDE. Finally, to run the application, I'm using mpirun.hydra from MPICH, downloaded by the PETSc configure script. I'm checking the process assignment as suggested in the last email.
Am I missing anything?

Regards,
Nelson

On 2015-08-20 16:17, Matthew Knepley wrote:

> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote:
>
>> Hello.
>>
>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing.
>>
>> As for the machine, I am using mpirun to run jobs in an 8-node cluster. I modified the makefile in the streams folder so it would run using my hostfile.
>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine's CPU has 4 cores and 1 socket.
>
> 1) Your launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes.
> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1.
>
> Thanks,
> Matt
>
>> Cheers,
>> Nelson
>>
>> On 2015-07-24 16:50, Barry Smith wrote:
>>
>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 ... processes with the option -log_summary and send (as attachments) the log summary information.
>>>
>>> Also on the same machine run the streams benchmark; with recent releases of PETSc you only need to do
>>>
>>> cd $PETSC_DIR
>>> make streams NPMAX=16 (or whatever your largest process count is)
>>>
>>> and send the output.
>>>
>>> I suspect that you are doing everything fine and it is more an issue with the configuration of your machine. Also read the information at http://www.mcs.anl.gov/petsc/documentation/faq.html#computers [2] on "binding".
>>>
>>> Barry
>>>
>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote:
>>>>
>>>> Hello,
>>>>
>>>> I have been using PETSc for a few months now, and it truly is a fantastic piece of software.
>>>>
>>>> In my particular example I am working with a large, sparse, distributed (MPI AIJ) matrix we can refer to as 'G'.
>>>> G is a horizontal, rectangular matrix (for example, 1.1 million rows by 2.1 million columns). This matrix is commonly very sparse and not diagonal "heavy" (for example, 5.2 million nnz of which ~50% are on the diagonal block of the MPI AIJ representation).
>>>> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), which we can refer to as 'm' and 'k'.
>>>> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >>>> >>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >>>> >>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >>>> >>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >>>> 1-calculate A >>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>> 3-apply this permutation to the lines of G. >>>> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >>>> >>>> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >>>> >>>> So, my questions are. >>>> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? >>>> >>>> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >>>> >>>> I have been avoiding asking for help for a while. I'm very sorry for the long email. >>>> Thank you very much for your time. >>>> Best Regards, >>>> Nelson > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener Links: ------ [1] mailto:nelsonflsilva at ist.utl.pt [2] http://www.mcs.anl.gov/petsc/documentation/faq.html#computers [3] mailto:nelsonflsilva at ist.utl.pt -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log04P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log05P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: Log06P
URL: 
From bsmith at mcs.anl.gov  Sat Aug 22 16:22:33 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 22 Aug 2015 16:22:33 -0500
Subject: [petsc-users] issues with sparse direct solvers
In-Reply-To: <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>
References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>
Message-ID: <15C59F64-2281-47D9-8040-193AA04E603E@mcs.anl.gov>

> On Aug 22, 2015, at 4:16 PM, Gideon Simpson wrote:
> 
> Thanks Barry, I'll take a look at debugging. I'm also going to try petsc 3.6, since that has a newer MUMPS build.
> 
> Regarding the SuperLU bugs, are they bad enough that I should distrust output even when errors were not generated?

   In my experience and what I've seen, no. It either crashes or runs correctly. But you can always check the residual after the solution.

> 
> 
> -gideon
> 
>> On Aug 22, 2015, at 5:12 PM, Barry Smith wrote:
>> 
>> 
>>> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote:
>>> 
>>> I'm having issues with both SuperLU dist and MUMPS, as compiled by PETSc, in the following sense:
>>> 
>>> 1. For large enough systems, which seems to vary depending on which computer I'm on, MUMPS seems to just die and never start, when it's used as the linear solver within a SNES. There's no error message, it just sits there and doesn't do anything.
>> 
>>   You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this.
>>> 
>>> 2. When running with SuperLU dist, I got the following error, with no further information:
>> 
>>   The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist  If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes
>> 
>>  Barry
>> 
>> 
>>> 
>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [3]PETSC ERROR: likely location of problem given in stack below
>>> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>>> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>> [3]PETSC ERROR: INSTEAD the line number of the start of the function
>>> is given.
>>> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [3]PETSC ERROR: Signal received >>> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> -------------------------------------------------------------------------- >>> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >>> with errorcode 59. >>> >>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> You may or may not see output from other processes, depending on >>> exactly when Open MPI kills them. >>> -------------------------------------------------------------------------- >>> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >>> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>> [6]PETSC ERROR: ------------------------------------------------------------------------ >>> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [6]PETSC ERROR: likely location of problem given in stack below >>> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [6]PETSC ERROR: INSTEAD the line number of the start of the function >>> [6]PETSC ERROR: is given. 
>>> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [6]PETSC ERROR: Signal received >>> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [7]PETSC ERROR: ------------------------------------------------------------------------ >>> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [7]PETSC ERROR: likely location of problem given in stack below >>> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [7]PETSC ERROR: INSTEAD the line number of the start of the function >>> [7]PETSC ERROR: is given. >>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [7]PETSC ERROR: Signal received >>> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [0]PETSC ERROR: likely location of problem given in stack below >>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>> [0]PETSC ERROR: is given. >>> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Signal received >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [1]PETSC ERROR: likely location of problem given in stack below >>> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>> [1]PETSC ERROR: is given. >>> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [1]PETSC ERROR: Signal received >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [2]PETSC ERROR: ------------------------------------------------------------------------ >>> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [2]PETSC ERROR: likely location of problem given in stack below >>> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [2]PETSC ERROR: INSTEAD the line number of the start of the function >>> [2]PETSC ERROR: is given. >>> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [2]PETSC ERROR: Signal received >>> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [4]PETSC ERROR: ------------------------------------------------------------------------ >>> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [4]PETSC ERROR: likely location of problem given in stack below >>> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [4]PETSC ERROR: INSTEAD the line number of the start of the function >>> [4]PETSC ERROR: is given. >>> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [4]PETSC ERROR: Signal received >>> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [5]PETSC ERROR: ------------------------------------------------------------------------ >>> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [5]PETSC ERROR: likely location of problem given in stack below >>> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [5]PETSC ERROR: INSTEAD the line number of the start of the function >>> [5]PETSC ERROR: is given. >>> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [5]PETSC ERROR: Signal received >>> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> >>> -gideon >>> >>> >> > From bsmith at mcs.anl.gov Sat Aug 22 16:49:30 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 16:49:30 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> Message-ID: <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> > On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: > > Hi. > > > I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. 
I send each of them in this email.
> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd".
> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy".
> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time.
> NMachines %NNZ ExecTime
> 1 machine0 100%; 16min08sec
>
> 2 machine0 91.1%; 24min58sec
> machine1 69.2%;
>
> 3 machine0 90.9% 25min42sec
> machine1 82.8%
> machine2 51.6%
>
> 4 machine0 91.9% 26min27sec
> machine1 82.4%
> machine2 73.1%
> machine3 39.9%
>
> 5 machine0 93.2% 39min23sec
> machine1 82.8%
> machine2 74.4%
> machine3 64.6%
> machine4 31.6%
>
> 6 machine0 94.2% 54min54sec
> machine1 82.6%
> machine2 73.1%
> machine3 65.2%
> machine4 55.9%
> machine5 25.4%

   Based on this I am guessing the last rows of the matrix have a lot of nonzeros away from the diagonal?

   There is a big load imbalance in something: for example with 2 processes you have

VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0
VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0
MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83
MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69

   the 5th column has the imbalance between slowest and fastest process. It is 4.2 for VecMax, 1.4 for MatMult and 2.3 for MatMultAdd; to get good speedups these need to be much closer to 1.

   How many nonzeros in the matrix are there per process? Is it very different for different processes? You really need to have each process have a similar number of matrix nonzeros. Do you have a picture of the nonzero structure of the matrix? Where does the matrix come from, why does it have this structure?

   Also likely there are just too many vector entries that need to be scattered to the last process for the matmults.
>
> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE.
>
> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script.
> I'm checking the process assignment as suggested on the last email.
>
> Am I missing anything?

   Your network is very poor; likely ethernet. It is hard to get much speedup with such slow reductions and sends and receives.

Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 0.000215769
Average time for zero size MPI_Send(): 5.94854e-05

   I think you are seeing such bad results due to an unkind matrix nonzero structure giving poor load balance and too much communication and a very poor computer network that just makes all the needed communication totally dominate.

>
> Regards,
> Nelson
>
> Em 2015-08-20 16:17, Matthew Knepley escreveu:
>
>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote:
>> Hello.
>>
>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing.
>>
>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile.
>> The output is attached to this email. It seems reasonable for a cluster with 8 machines.
From "lscpu", each machine cpu has 4 cores and 1 socket. >> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >> Thanks, >> Matt >> >> Cheers, >> Nelson >> >> >> Em 2015-07-24 16:50, Barry Smith escreveu: >> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >> ... processes with the option -log_summary and send (as attachments) >> the log summary information. >> >> Also on the same machine run the streams benchmark; with recent >> releases of PETSc you only need to do >> >> cd $PETSC_DIR >> make streams NPMAX=16 (or whatever your largest process count is) >> >> and send the output. >> >> I suspect that you are doing everything fine and it is more an issue >> with the configuration of your machine. Also read the information at >> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >> "binding" >> >> Barry >> >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote: >> >> Hello, >> >> I have been using PETSc for a few months now, and it truly is fantastic piece of software. >> >> In my particular example I am working with a large, sparse distributed (MPI AIJ) matrix we can refer as 'G'. >> G is a horizontal - retangular matrix (for example, 1,1 Million rows per 2,1 Million columns). This matrix is commonly very sparse and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal block of MPI AIJ representation). >> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), we can refer as 'm' and 'k'. >> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >> >> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >> >> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >> >> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >> 1-calculate A >> 2-calculate, localy to 1 machine, the RCM permutation IS using A >> 3-apply this permutation to the lines of G. >> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >> >> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >> >> So, my questions are. >> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? 
>> >> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >> >> >> I have been avoiding asking for help for a while. I'm very sorry for the long email. >> Thank you very much for your time. >> Best Regards, >> Nelson >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > From bsmith at mcs.anl.gov Sat Aug 22 21:29:25 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 21:29:25 -0500 Subject: [petsc-users] Variatonal inequalities In-Reply-To: References: Message-ID: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> David, Currently the only way to do this without adding a lot of additional PETSc code is to add additional variables such that only box constraints appear in the final problem. For example say you have constraints c <= Ax <= d then introduce new variables y = Ax and then you have the larger problem of unknowns (x,y) and box constrains on y with -infinity and +infinity constraints on x. Barry > On Aug 22, 2015, at 6:59 AM, David Knezevic wrote: > > Hi all, > > I see from Section 5.7 of the manual that SNES supports box constraints on variables, which is great. However, I was also hoping to also be able to consider general linear inequality constraints, so I was wondering if anyone has any suggestions on how (or if) that could be done with PETSc? > > Thanks, > David > From dongluo at pku.edu.cn Sat Aug 22 23:38:09 2015 From: dongluo at pku.edu.cn (=?UTF-8?B?572X5Lic?=) Date: Sun, 23 Aug 2015 12:38:09 +0800 (GMT+08:00) Subject: [petsc-users] [petsc-user] intall problem when make test Message-ID: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn> Dear all, I'm Luo Dong from Peking University in Beijing, China. I'm trying to intall the Petsc on my account on Tianhe supercomputer. But I met some problem. I have successfully make the build, but when I use make test, there is some error: [taojj at ln3%tianhe 3.6.1]$ make test Running test examples to verify correct installation Using PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 and PETSC_ARCH=linux-dbg *mpiexec not found*. Please run src/snes/examples/tutorials/ex19 manually *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1/src/snes/examples/tutorials ex5f ********************************************************* I don't know why thers is 'mpiexec not found'. I have use compilers mpicc, mpicxx, mpif90. On my account, I should use 'yhrun -n 1 -t 30 -p TH_NET' to submit a running work. I don't know if I should set some thing about this and where I should do that. I have attached the uname-a output below: [taojj at ln3%tianhe tutorials]$ uname -a Linux ln3 2.6.32-358.11.1.2.ky3.1.x86_64 #1 SMP Mon Jul 8 13:05:58 CST 2013 x86_64 x86_64 x86_64 GNU/Linux Please give me some help. Thanks for your help in advance. Best, Dong -------------- next part -------------- An HTML attachment was scrubbed... 
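
For the variational inequality question in the thread above (Barry's suggestion of introducing y = Ax so that only box constraints remain), one way the bounds might be set up is sketched below. It assumes the augmented unknown z stacks the n entries of x first and the m entries of y last, and that the constraint bounds c and d happen to be available on every process; the residual and Jacobian of the augmented system (which also enforce Ax - y = 0) belong to the application and are not shown. The helper name and the choice of SNESVINEWTONRSLS are illustrative only.

#include <petscsnes.h>

PetscErrorCode SetAugmentedBounds(SNES snes, Vec z, PetscInt n, PetscInt m,
                                  const PetscScalar *c, const PetscScalar *d)
{
  Vec            zl, zu;
  PetscInt       i, lo, hi;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(z, &zl);CHKERRQ(ierr);
  ierr = VecDuplicate(z, &zu);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(z, &lo, &hi);CHKERRQ(ierr);
  for (i = lo; i < hi; i++) {
    if (i < n) { /* x entries: unconstrained */
      ierr = VecSetValue(zl, i, PETSC_NINFINITY, INSERT_VALUES);CHKERRQ(ierr);
      ierr = VecSetValue(zu, i, PETSC_INFINITY,  INSERT_VALUES);CHKERRQ(ierr);
    } else {     /* y entries: c <= y <= d enforces c <= A x <= d */
      ierr = VecSetValue(zl, i, c[i-n], INSERT_VALUES);CHKERRQ(ierr);
      ierr = VecSetValue(zu, i, d[i-n], INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = VecAssemblyBegin(zl);CHKERRQ(ierr); ierr = VecAssemblyEnd(zl);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(zu);CHKERRQ(ierr); ierr = VecAssemblyEnd(zu);CHKERRQ(ierr);

  ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr); /* one possible VI solver */
  ierr = SNESVISetVariableBounds(snes, zl, zu);CHKERRQ(ierr);
  ierr = VecDestroy(&zl);CHKERRQ(ierr);
  ierr = VecDestroy(&zu);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
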
URL: 
From balay at mcs.anl.gov  Sun Aug 23 00:47:42 2015
From: balay at mcs.anl.gov (Satish Balay)
Date: Sun, 23 Aug 2015 00:47:42 -0500
Subject: [petsc-users] [petsc-user] intall problem when make test
In-Reply-To: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn>
References: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn>
Message-ID: 

As the message says - try running the examples manually. i.e.

cd src/snes/examples/tutorials
make PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 PETSC_ARCH=linux-dbg ex19 ex5f

yhrun -n 1 -t 30 -p TH_NET ./ex19
yhrun -n 1 -t 30 -p TH_NET ./ex5f

Satish

On Sat, 22 Aug 2015, 罗东 (Luo Dong) wrote:

> Dear all,
> 
> 
> I'm Luo Dong from Peking University in Beijing, China. I'm trying to intall the Petsc on my account on Tianhe supercomputer. But I met some problem.
I have successfully make the build, but when I use make test, there is some error: > > > > > > [taojj at ln3%tianhe 3.6.1]$ make test > > Running test examples to verify correct installation > > Using PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 and PETSC_ARCH=linux-dbg > > *mpiexec not found*. Please run src/snes/examples/tutorials/ex19 manually > > *******************Error detected during compile or link!******************* > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > /vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1/src/snes/examples/tutorials ex5f > > ********************************************************* > > > > > > > > > > I don't know why thers is 'mpiexec not found'. I have use compilers mpicc, mpicxx, mpif90. On my account, I should use 'yhrun -n 1 -t 30 -p TH_NET' to submit a running work. I don't know if I should set some thing about this and where I should do that. > > > > > > I have attached the uname-a output below: > > > > > > [taojj at ln3%tianhe tutorials]$ uname -a > > Linux ln3 2.6.32-358.11.1.2.ky3.1.x86_64 #1 SMP Mon Jul 8 13:05:58 CST 2013 x86_64 x86_64 x86_64 GNU/Linux > > > > > > Please give me some help. > > > > > > Thanks for your help in advance. > > > > > > Best, > > > > > > Dong From david.knezevic at akselos.com Sun Aug 23 09:29:01 2015 From: david.knezevic at akselos.com (David Knezevic) Date: Sun, 23 Aug 2015 10:29:01 -0400 Subject: [petsc-users] Variatonal inequalities In-Reply-To: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> References: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> Message-ID: On Sat, Aug 22, 2015 at 10:29 PM, Barry Smith wrote: > > David, > > Currently the only way to do this without adding a lot of additional > PETSc code is to add additional variables such that only box constraints > appear in the final problem. For example say you have constraints c <= Ax > <= d then introduce new variables y = Ax and then you have the larger > problem of unknowns (x,y) and box constrains on y with -infinity and > +infinity constraints on x. > OK, that makes sense, thanks for the info! David -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Sun Aug 23 10:12:22 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Sun, 23 Aug 2015 16:12:22 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> Message-ID: Thank you for the fast response! Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. 
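
The "complete vector copy in each processor" idea mentioned above can be done with a scatter-to-all; a minimal sketch follows (the function name is made up, and, as noted above, the gather still has to be repeated every time the parallel vector changes, so it does not remove the per-iteration communication).

#include <petscvec.h>

/* Gather the distributed vector v into a full-length sequential copy on every
   process. The caller must initialize *ctx = NULL and *vfull = NULL before the
   first call, and destroy both once at the end. */
PetscErrorCode GatherFullCopy(Vec v, VecScatter *ctx, Vec *vfull)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  if (!*ctx) { ierr = VecScatterCreateToAll(v, ctx, vfull);CHKERRQ(ierr); }
  ierr = VecScatterBegin(*ctx, v, *vfull, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(*ctx, v, *vfull, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
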
Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. 1 machine [0] Matrix diagonal_nnz:16800000 (100.00 %) [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) ExecTime: 4min47sec 2 machines [0] Matrix diagonal_nnz:4400000 (52.38 %) [1] Matrix diagonal_nnz:4000000 (47.62 %) [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) ExecTime: 13min23sec 3 machines [0] Matrix diagonal_nnz:2933334 (52.38 %) [1] Matrix diagonal_nnz:533327 (9.52 %) [2] Matrix diagonal_nnz:2399999 (42.86 %) [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) ExecTime: 20min26sec As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. I really should find a way to permute these matrices to a more friendly structure. Thank you very much for the help. Nelson Em 2015-08-22 22:49, Barry Smith escreveu: >> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva >> wrote: >> >> Hi. >> >> >> I managed to finish the re-implementation. I ran the program with >> 1,2,3,4,5,6 machines and saved the summary. I send each of them in >> this email. >> In these executions, the program performs Matrix-Vector (MatMult, >> MatMultAdd) products and Vector-Vector operations. From what I >> understand while reading the logs, the program takes most of the time >> in "VecScatterEnd". >> In this example, the matrix taking part on the Matrix-Vector >> products is not "much diagonal heavy". >> The following numbers are the percentages of nnz values on the >> matrix diagonal block for each machine, and each execution time. 
>> NMachines %NNZ ExecTime >> 1 machine0 100%; 16min08sec >> >> 2 machine0 91.1%; 24min58sec >> machine1 69.2%; >> >> 3 machine0 90.9% 25min42sec >> machine1 82.8% >> machine2 51.6% >> >> 4 machine0 91.9% 26min27sec >> machine1 82.4% >> machine2 73.1% >> machine3 39.9% >> >> 5 machine0 93.2% 39min23sec >> machine1 82.8% >> machine2 74.4% >> machine3 64.6% >> machine4 31.6% >> >> 6 machine0 94.2% 54min54sec >> machine1 82.6% >> machine2 73.1% >> machine3 65.2% >> machine4 55.9% >> machine5 25.4% > > Based on this I am guessing the last rows of the matrix have a lot > of nonzeros away from the diagonal? > > There is a big load imbalance in something: for example with 2 > processes you have > > VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 > VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 > MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 > 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 > MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 > 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 > > the 5th column has the imbalance between slowest and fastest > process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to > get good speed ups these need to be much closer to 1. > > How many nonzeros in the matrix are there per process? Is it very > different for difference processes? You really need to have each > process have similar number of matrix nonzeros. Do you have a > picture of the nonzero structure of the matrix? Where does the > matrix > come from, why does it have this structure? > > Also likely there are just to many vector entries that need to be > scattered to the last process for the matmults. >> >> In this implementation I'm using MatCreate and VecCreate. I'm also >> leaving the partition sizes in PETSC_DECIDE. >> >> Finally, to run the application, I'm using mpirun.hydra from mpich, >> downloaded by PETSc configure script. >> I'm checking the process assignment as suggested on the last email. >> >> Am I missing anything? > > Your network is very poor; likely ethernet. It is had to get much > speedup with such slow reductions and sends and receives. > > Average time to get PetscTime(): 1.19209e-07 > Average time for MPI_Barrier(): 0.000215769 > Average time for zero size MPI_Send(): 5.94854e-05 > > I think you are seeing such bad results due to an unkind matrix > nonzero structure giving per load balance and too much communication > and a very poor computer network that just makes all the needed > communication totally dominate. > > >> >> Regards, >> Nelson >> >> Em 2015-08-20 16:17, Matthew Knepley escreveu: >> >>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva >>> wrote: >>> Hello. >>> >>> I am sorry for the long time without response. I decided to rewrite >>> my application in a different way and will send the log_summary >>> output when done reimplementing. >>> >>> As for the machine, I am using mpirun to run jobs in a 8 node >>> cluster. I modified the makefile on the steams folder so it would run >>> using my hostfile. >>> The output is attached to this email. It seems reasonable for a >>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores >>> and 1 socket. >>> 1) You launcher is placing processes haphazardly. I would figure >>> out how to assign them to certain nodes >>> 2) Each node has enough bandwidth for 1 core, so it does not make >>> much sense to use more than 1. 
>>> Thanks, >>> Matt >>> >>> Cheers, >>> Nelson >>> >>> >>> Em 2015-07-24 16:50, Barry Smith escreveu: >>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>> ... processes with the option -log_summary and send (as >>> attachments) >>> the log summary information. >>> >>> Also on the same machine run the streams benchmark; with recent >>> releases of PETSc you only need to do >>> >>> cd $PETSC_DIR >>> make streams NPMAX=16 (or whatever your largest process count is) >>> >>> and send the output. >>> >>> I suspect that you are doing everything fine and it is more an >>> issue >>> with the configuration of your machine. Also read the information >>> at >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>> "binding" >>> >>> Barry >>> >>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >>> wrote: >>> >>> Hello, >>> >>> I have been using PETSc for a few months now, and it truly is >>> fantastic piece of software. >>> >>> In my particular example I am working with a large, sparse >>> distributed (MPI AIJ) matrix we can refer as 'G'. >>> G is a horizontal - retangular matrix (for example, 1,1 Million >>> rows per 2,1 Million columns). This matrix is commonly very sparse >>> and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% >>> are on the diagonal block of MPI AIJ representation). >>> To work with this matrix, I also have a few parallel vectors >>> (created using MatCreate Vec), we can refer as 'm' and 'k'. >>> I am trying to parallelize an iterative algorithm in which the most >>> computational heavy operations are: >>> >>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>> (MatMultAdd). From what I have been reading, to achive a good speedup >>> in this operation, G should be as much diagonal as possible, due to >>> overlapping communication and computation. But even when using a G >>> matrix in which the diagonal block has ~95% of the nnz, I cannot get >>> a decent speedup. Most of the times, the performance even gets worse. >>> >>> ->Matrix-Matrix Multiplication, in this case I need to perform G * >>> G' = A, where A is later used on the linear solver and G' is >>> transpose of G. The speedup in this operation is not worse, although >>> is not very good. >>> >>> ->Linear problem solving. Lastly, In this operation I compute >>> "Ax=b" from the last two operations. I tried to apply a RCM >>> permutation to A to make it more diagonal, for better performance. >>> However, the problem I faced was that, the permutation is performed >>> locally in each processor and thus, the final result is different >>> with different number of processors. I assume this was intended to >>> reduce communication. The solution I found was >>> 1-calculate A >>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>> 3-apply this permutation to the lines of G. >>> This works well, and A is generated as if RCM permuted. It is fine >>> to do this operation in one machine because it is only done once >>> while reading the input. The nnz of G become more spread and less >>> diagonal, causing problems when calculating G * m + k = b. >>> >>> These 3 operations (except the permutation) are performed in each >>> iteration of my algorithm. >>> >>> So, my questions are. >>> -What are the characteristics of G that lead to a good speedup in >>> the operations I described? Am I missing something and too much >>> obsessed with the diagonal block? 
>>> >>> -Is there a better way to permute A without permute G and still get >>> the same result using 1 or N machines? >>> >>> >>> I have been avoiding asking for help for a while. I'm very sorry >>> for the long email. >>> Thank you very much for your time. >>> Best Regards, >>> Nelson >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >> >> >> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix1.png Type: application/octet-stream Size: 1936 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix2.png Type: application/octet-stream Size: 2058 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: streams.out URL: From bsmith at mcs.anl.gov Sun Aug 23 14:19:52 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 23 Aug 2015 14:19:52 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> Message-ID: <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> A suggestion: take your second ordering and now interlace the second half of the rows with the first half of the rows (keeping the some column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 etc this will take the two separate "diagonal" bands and form a single "diagonal band". This will increase the "diagonal block weight" to be pretty high and the only scatters will need to be for the final rows of the input vector that all processes need to do their part of the multiply. Generate the image to make sure what I suggest make sense and then run this ordering with 1, 2, and 3 processes. Send the logs. Barry > On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva wrote: > > Thank you for the fast response! > > Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. > For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. > > Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. > Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. 
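
Per-rank numbers like the "local nnz" and "diagonal_nnz" figures quoted in this thread, which is also a quick way to check how a row reordering such as the one suggested above changes the diagonal-block weight, can be read off the assembled matrix directly. A small sketch, assuming an MPIAIJ matrix and a hypothetical helper name:

#include <petscmat.h>

PetscErrorCode ReportDiagonalWeight(Mat A)
{
  Mat             Ad, Ao;          /* local diagonal and off-diagonal blocks */
  const PetscInt *garray;
  MatInfo         info_d, info_o;
  double          nzd, nzo;
  PetscMPIInt     rank;
  MPI_Comm        comm;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = PetscObjectGetComm((PetscObject)A, &comm);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm, &rank);CHKERRQ(ierr);
  ierr = MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &garray);CHKERRQ(ierr);   /* MPIAIJ only */
  ierr = MatGetInfo(Ad, MAT_LOCAL, &info_d);CHKERRQ(ierr);
  ierr = MatGetInfo(Ao, MAT_LOCAL, &info_o);CHKERRQ(ierr);
  nzd  = (double)info_d.nz_used;
  nzo  = (double)info_o.nz_used;
  ierr = PetscSynchronizedPrintf(comm,
           "[%d] local nnz %.0f, diagonal-block nnz %.0f (%.2f %%)\n",
           rank, nzd + nzo, nzd, 100.0*nzd/(nzd + nzo));CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(comm, PETSC_STDOUT);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
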
> > More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. > > I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. > I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. > Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. > The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. > > > With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. > 1 machine > [0] Matrix diagonal_nnz:16800000 (100.00 %) > [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) > ExecTime: 4min47sec > > 2 machines > [0] Matrix diagonal_nnz:4400000 (52.38 %) > [1] Matrix diagonal_nnz:4000000 (47.62 %) > > [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > ExecTime: 13min23sec > > 3 machines > [0] Matrix diagonal_nnz:2933334 (52.38 %) > [1] Matrix diagonal_nnz:533327 (9.52 %) > [2] Matrix diagonal_nnz:2399999 (42.86 %) > > [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) > ExecTime: 20min26sec > > As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. > > I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. > > I really should find a way to permute these matrices to a more friendly structure. > > Thank you very much for the help. > Nelson > > Em 2015-08-22 22:49, Barry Smith escreveu: >>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: >>> >>> Hi. >>> >>> >>> I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. >>> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd". >>> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy". >>> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time. >>> NMachines %NNZ ExecTime >>> 1 machine0 100%; 16min08sec >>> >>> 2 machine0 91.1%; 24min58sec >>> machine1 69.2%; >>> >>> 3 machine0 90.9% 25min42sec >>> machine1 82.8% >>> machine2 51.6% >>> >>> 4 machine0 91.9% 26min27sec >>> machine1 82.4% >>> machine2 73.1% >>> machine3 39.9% >>> >>> 5 machine0 93.2% 39min23sec >>> machine1 82.8% >>> machine2 74.4% >>> machine3 64.6% >>> machine4 31.6% >>> >>> 6 machine0 94.2% 54min54sec >>> machine1 82.6% >>> machine2 73.1% >>> machine3 65.2% >>> machine4 55.9% >>> machine5 25.4% >> >> Based on this I am guessing the last rows of the matrix have a lot >> of nonzeros away from the diagonal? 
>> >> There is a big load imbalance in something: for example with 2 >> processes you have >> >> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >> >> the 5th column has the imbalance between slowest and fastest >> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >> get good speed ups these need to be much closer to 1. >> >> How many nonzeros in the matrix are there per process? Is it very >> different for difference processes? You really need to have each >> process have similar number of matrix nonzeros. Do you have a >> picture of the nonzero structure of the matrix? Where does the matrix >> come from, why does it have this structure? >> >> Also likely there are just to many vector entries that need to be >> scattered to the last process for the matmults. >>> >>> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE. >>> >>> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script. >>> I'm checking the process assignment as suggested on the last email. >>> >>> Am I missing anything? >> >> Your network is very poor; likely ethernet. It is had to get much >> speedup with such slow reductions and sends and receives. >> >> Average time to get PetscTime(): 1.19209e-07 >> Average time for MPI_Barrier(): 0.000215769 >> Average time for zero size MPI_Send(): 5.94854e-05 >> >> I think you are seeing such bad results due to an unkind matrix >> nonzero structure giving per load balance and too much communication >> and a very poor computer network that just makes all the needed >> communication totally dominate. >> >> >>> >>> Regards, >>> Nelson >>> >>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>> >>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote: >>>> Hello. >>>> >>>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. >>>> >>>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. >>>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. >>>> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >>>> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >>>> Thanks, >>>> Matt >>>> >>>> Cheers, >>>> Nelson >>>> >>>> >>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>> ... processes with the option -log_summary and send (as attachments) >>>> the log summary information. >>>> >>>> Also on the same machine run the streams benchmark; with recent >>>> releases of PETSc you only need to do >>>> >>>> cd $PETSC_DIR >>>> make streams NPMAX=16 (or whatever your largest process count is) >>>> >>>> and send the output. 
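Acting on the nonzero-balance point above means giving MatSetSizes an explicit local row count instead of PETSC_DECIDE, chosen so that nonzeros rather than rows are split evenly. A sketch under the assumption that the per-rank row count has already been computed from the row nnz counts (all names are illustrative):

#include <petscmat.h>

/* Create G with a caller-chosen local row count; d_nnz/o_nnz are the usual
   per-local-row preallocation arrays. */
static PetscErrorCode CreateBalancedG(MPI_Comm comm, PetscInt my_local_rows, PetscInt Ncols,
                                      const PetscInt d_nnz[], const PetscInt o_nnz[], Mat *G)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreate(comm, G);CHKERRQ(ierr);
  ierr = MatSetSizes(*G, my_local_rows, PETSC_DECIDE, PETSC_DETERMINE, Ncols);CHKERRQ(ierr);
  ierr = MatSetType(*G, MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*G, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The vectors used in the MatMult/MatMultAdd calls then need matching explicit local sizes (VecSetSizes) so their layouts line up with G's row and column maps.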
>>>> >>>> I suspect that you are doing everything fine and it is more an issue >>>> with the configuration of your machine. Also read the information at >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>> "binding" >>>> >>>> Barry >>>> >>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote: >>>> >>>> Hello, >>>> >>>> I have been using PETSc for a few months now, and it truly is fantastic piece of software. >>>> >>>> In my particular example I am working with a large, sparse distributed (MPI AIJ) matrix we can refer as 'G'. >>>> G is a horizontal - retangular matrix (for example, 1,1 Million rows per 2,1 Million columns). This matrix is commonly very sparse and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal block of MPI AIJ representation). >>>> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), we can refer as 'm' and 'k'. >>>> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >>>> >>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >>>> >>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >>>> >>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >>>> 1-calculate A >>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>> 3-apply this permutation to the lines of G. >>>> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >>>> >>>> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >>>> >>>> So, my questions are. >>>> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? >>>> >>>> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >>>> >>>> >>>> I have been avoiding asking for help for a while. I'm very sorry for the long email. >>>> Thank you very much for your time. >>>> Best Regards, >>>> Nelson >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>> >>> > From zonexo at gmail.com Mon Aug 24 04:09:53 2015 From: zonexo at gmail.com (Wee-Beng Tay) Date: Mon, 24 Aug 2015 17:09:53 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal Message-ID: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Mon Aug 24 04:54:54 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Mon, 24 Aug 2015 18:54:54 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. 
If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI > along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, now I'm > using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work locally? > > Which is a simpler/better option? > > Is there an example in Fortran for MatSetValuesStencil? > > Do I also need to use DMDAGetAO together with MatSetValuesStencil or > MatSetValuesLocal? > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Mon Aug 24 04:56:43 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Mon, 24 Aug 2015 18:56:43 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: Small erratum, in the declaration for v, it should be PetscScalar :: v(n) where n is the same as for col (6 in the example, not 8 which I copied from my particular case) 2015-08-24 18:54 GMT+09:00 Timoth?e Nicolas : > Hi, > > ex5 of snes can give you an example of the two routines. > > The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version > ex5f90.F uses MatSetValuesLocal. > > However, I use MatSetValuesStencil also in Fortran, there is no problem, > and no need to mess around with DMDAGetAO, I think. > > To input values in the matrix, you need to do the following : > > ! Declare the matstencils for matrix columns and rows > MatStencil :: row(4,1),col(4,n) > ! Declare the quantity which will store the actual matrix elements > PetscScalar :: v(8) > > The first dimension in row and col is 4 to allow for 3 spatial dimensions > (even if you use only 2) plus one degree of freedom if you have several > fields in your DMDA. The second dimension is 1 for row (you input one row > at a time) and n for col, where n is the number of columns that you input. > For instance, if at node (1,i,j) (1 is the index of the degree of > freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), > (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set > n=6 > > Then you define the row number by naturally doing the following, inside a > local loop : > > row(MatStencil_i,1) = i -1 > row(MatStencil_j,1) = j -1 > row(MatStencil_c,1) = 1 -1 > > the -1 are here because FORTRAN indexing is different from the native C > indexing. I put them on the right to make this more apparent. > > Then the column information. For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will > have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... 
> > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it worked > that way, I succesfully computed a Jacobian that way. It is very sensitive. > If you slightly depart from the right jacobian, you will see a huge > difference compared to using matrix free with -snes_mf, so you can hardly > make a mistake because you would see it. That's how I finally got it to > work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> >> Thanks! >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 24 05:21:21 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 05:21:21 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI > along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, now I'm > using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work locally? > No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. > Which is a simpler/better option? > MatSetValuesStencil() > Is there an example in Fortran for MatSetValuesStencil? > Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or > MatSetValuesLocal? > No. Thanks, Matt > Thanks! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Mon Aug 24 09:24:27 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Mon, 24 Aug 2015 15:24:27 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> Message-ID: <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Hello. Thank you very much for your time. I understood the idea, it works very well. I also noticed that my algorithm performs a different number of iterations with different number of machines. The stop conditions are calculated using PETSc "matmultadd". 
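For reference, a hedged sketch of what such a MatMultAdd-based stop test can look like when it uses a relative tolerance well above machine precision (the actual criterion in the code may differ; w is a work vector with the same layout as b, and all names are illustrative):

#include <petscmat.h>

/* Converged when || (G*m + k) - b || <= rtol * ||b||. */
static PetscErrorCode CheckConverged(Mat G, Vec m, Vec k, Vec b, Vec w,
                                     PetscReal rtol, PetscBool *done)
{
  PetscReal      rnorm, bnorm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr  = MatMultAdd(G, m, k, w);CHKERRQ(ierr);    /* w = G*m + k     */
  ierr  = VecAXPY(w, -1.0, b);CHKERRQ(ierr);       /* w = G*m + k - b */
  ierr  = VecNorm(w, NORM_2, &rnorm);CHKERRQ(ierr);
  ierr  = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
  *done = (PetscBool)(rnorm <= rtol*bnorm);
  PetscFunctionReturn(0);
}

With a tolerance down at round-off level, the iteration count becomes sensitive to the order in which parallel reductions sum their contributions, which is one way runs on different process counts end up doing different numbers of iterations.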
I'm very positive that there may be a program bug in my code, or could it be something with PETSc? I also need to figure out why those vecmax ratio are so high. The vecset is understandable as I'm distributing the initial information from the root machine in sequencial. These are the new values: 1 machine [0] Matrix diagonal_nnz:16800000 (100.00 %) [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) ExecTime: 4min47sec Iterations: 236 2 machines [0] Matrix diagonal_nnz:8000000 (95.24 %) [1] Matrix diagonal_nnz:7600000 (90.48 %) [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) ExecTime: 5min26sec Iterations: 330 3 machines [0] Matrix diagonal_nnz:5333340 (95.24 %) [1] Matrix diagonal_nnz:4800012 (85.71 %) [2] Matrix diagonal_nnz:4533332 (80.95 %) [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) ExecTime: 5min25sec Iterations: 346 The suggested permutation worked very well in comparison with the original matrix structure. The no-speedup may be related with the different number of iterations. Once again, thank you very much for the time. Cheers, Nelson Em 2015-08-23 20:19, Barry Smith escreveu: > A suggestion: take your second ordering and now interlace the second > half of the rows with the first half of the rows (keeping the some > column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 > etc this will take the two separate "diagonal" bands and form a > single "diagonal band". This will increase the "diagonal block > weight" to be pretty high and the only scatters will need to be for > the final rows of the input vector that all processes need to do > their > part of the multiply. Generate the image to make sure what I suggest > make sense and then run this ordering with 1, 2, and 3 processes. > Send > the logs. > > Barry > >> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva >> wrote: >> >> Thank you for the fast response! >> >> Yes. The last rows of the matrix are indeed more dense, compared >> with the remaining ones. >> For this example, concerning load balance between machines, the last >> process had 46% of the matrix nonzero entries. A few weeks ago I >> suspected of this problem and wrote a little function that could >> permute the matrix rows based on their number of nonzeros. However, >> the matrix would become less pleasant regarding "diagonal block >> weight", and I stop using it as i thought I was becoming worse. >> >> Also, due to this problem, I thought I could have a complete vector >> copy in each processor, instead of a distributed vector. I tried to >> implement this idea, but had no luck with the results. However, even >> if this solution would work, the communication for vector update was >> inevitable once each iteration of my algorithm. >> Since this is a rectangular matrix, I cannot apply RCM or such >> permutations, however I can permute rows and columns though. >> >> More specifically, the problem I'm trying to solve is one of balance >> the best guess and uncertainty estimates of a set of Input-Output >> subject to linear constraints and ancillary informations. The matrix >> is called an aggregation matrix, and each entry can be 1, 0 or -1. I >> don't know the cause of its nonzero structure. I'm addressing this >> problem using a weighted least-squares algorithm. 
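Since RCM cannot be applied to the rectangular G directly, the workaround described earlier in the thread (build A = G*G', compute an RCM ordering of A on one rank, then renumber the rows of G accordingly) can be sketched roughly as below; G_seq is assumed to be a sequential AIJ copy of G held on a single process, and the PETSC_DEFAULT fill is an illustrative choice:

#include <petscmat.h>

/* Hypothetical helper for the one-rank preprocessing step: returns an IS
   listing the rows of A (and hence of G) in their RCM order. */
static PetscErrorCode RCMRowOrderingFromA(Mat G_seq, IS *rowperm)
{
  Mat            Gt, A;
  IS             colperm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatTranspose(G_seq, MAT_INITIAL_MATRIX, &Gt);CHKERRQ(ierr);
  ierr = MatMatMult(G_seq, Gt, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &A);CHKERRQ(ierr); /* A = G*G' */
  ierr = MatGetOrdering(A, MATORDERINGRCM, rowperm, &colperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);  /* A is symmetric, so the column ordering is not needed */
  ierr = MatDestroy(&Gt);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  /* the rows of G would then be renumbered at read time following *rowperm */
  PetscFunctionReturn(0);
}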
>> >> I ran the code with a different, more friendly problem topology, >> logging the load of nonzero entries and the "diagonal load" per >> processor. >> I'm sending images of both matrices nonzero structure. The last >> email example used matrix1, the example in this email uses matrix2. >> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns >> and 5.171.901 nnz. >> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns >> and 16.800.000 nnz. >> >> >> With 1,2,3 machines, I have these distributions of nonzeros (using >> matrix2). I'm sending the logs in this email. >> 1 machine >> [0] Matrix diagonal_nnz:16800000 (100.00 %) >> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 >> (100.00 %) >> ExecTime: 4min47sec >> >> 2 machines >> [0] Matrix diagonal_nnz:4400000 (52.38 %) >> [1] Matrix diagonal_nnz:4000000 (47.62 %) >> >> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 >> %) >> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 >> %) >> ExecTime: 13min23sec >> >> 3 machines >> [0] Matrix diagonal_nnz:2933334 (52.38 %) >> [1] Matrix diagonal_nnz:533327 (9.52 %) >> [2] Matrix diagonal_nnz:2399999 (42.86 %) >> >> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 >> %) >> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 >> %) >> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 >> %) >> ExecTime: 20min26sec >> >> As for the network, I ran the make streams NPMAX=3 again. I'm also >> sending it in this email. >> >> I too think that these bad results are caused by a combination of >> bad matrix structure, especially the "diagonal weight", and maybe >> network. >> >> I really should find a way to permute these matrices to a more >> friendly structure. >> >> Thank you very much for the help. >> Nelson >> >> Em 2015-08-22 22:49, Barry Smith escreveu: >>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva >>>> wrote: >>>> >>>> Hi. >>>> >>>> >>>> I managed to finish the re-implementation. I ran the program with >>>> 1,2,3,4,5,6 machines and saved the summary. I send each of them in >>>> this email. >>>> In these executions, the program performs Matrix-Vector (MatMult, >>>> MatMultAdd) products and Vector-Vector operations. From what I >>>> understand while reading the logs, the program takes most of the >>>> time in "VecScatterEnd". >>>> In this example, the matrix taking part on the Matrix-Vector >>>> products is not "much diagonal heavy". >>>> The following numbers are the percentages of nnz values on the >>>> matrix diagonal block for each machine, and each execution time. >>>> NMachines %NNZ ExecTime >>>> 1 machine0 100%; 16min08sec >>>> >>>> 2 machine0 91.1%; 24min58sec >>>> machine1 69.2%; >>>> >>>> 3 machine0 90.9% 25min42sec >>>> machine1 82.8% >>>> machine2 51.6% >>>> >>>> 4 machine0 91.9% 26min27sec >>>> machine1 82.4% >>>> machine2 73.1% >>>> machine3 39.9% >>>> >>>> 5 machine0 93.2% 39min23sec >>>> machine1 82.8% >>>> machine2 74.4% >>>> machine3 64.6% >>>> machine4 31.6% >>>> >>>> 6 machine0 94.2% 54min54sec >>>> machine1 82.6% >>>> machine2 73.1% >>>> machine3 65.2% >>>> machine4 55.9% >>>> machine5 25.4% >>> >>> Based on this I am guessing the last rows of the matrix have a >>> lot >>> of nonzeros away from the diagonal? 
>>> >>> There is a big load imbalance in something: for example with 2 >>> processes you have >>> >>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>> >>> the 5th column has the imbalance between slowest and fastest >>> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, >>> to >>> get good speed ups these need to be much closer to 1. >>> >>> How many nonzeros in the matrix are there per process? Is it very >>> different for difference processes? You really need to have each >>> process have similar number of matrix nonzeros. Do you have a >>> picture of the nonzero structure of the matrix? Where does the >>> matrix >>> come from, why does it have this structure? >>> >>> Also likely there are just to many vector entries that need to be >>> scattered to the last process for the matmults. >>>> >>>> In this implementation I'm using MatCreate and VecCreate. I'm also >>>> leaving the partition sizes in PETSC_DECIDE. >>>> >>>> Finally, to run the application, I'm using mpirun.hydra from >>>> mpich, downloaded by PETSc configure script. >>>> I'm checking the process assignment as suggested on the last >>>> email. >>>> >>>> Am I missing anything? >>> >>> Your network is very poor; likely ethernet. It is had to get much >>> speedup with such slow reductions and sends and receives. >>> >>> Average time to get PetscTime(): 1.19209e-07 >>> Average time for MPI_Barrier(): 0.000215769 >>> Average time for zero size MPI_Send(): 5.94854e-05 >>> >>> I think you are seeing such bad results due to an unkind matrix >>> nonzero structure giving per load balance and too much >>> communication >>> and a very poor computer network that just makes all the needed >>> communication totally dominate. >>> >>> >>>> >>>> Regards, >>>> Nelson >>>> >>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva >>>>> wrote: >>>>> Hello. >>>>> >>>>> I am sorry for the long time without response. I decided to >>>>> rewrite my application in a different way and will send the >>>>> log_summary output when done reimplementing. >>>>> >>>>> As for the machine, I am using mpirun to run jobs in a 8 node >>>>> cluster. I modified the makefile on the steams folder so it would >>>>> run using my hostfile. >>>>> The output is attached to this email. It seems reasonable for a >>>>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores >>>>> and 1 socket. >>>>> 1) You launcher is placing processes haphazardly. I would figure >>>>> out how to assign them to certain nodes >>>>> 2) Each node has enough bandwidth for 1 core, so it does not make >>>>> much sense to use more than 1. >>>>> Thanks, >>>>> Matt >>>>> >>>>> Cheers, >>>>> Nelson >>>>> >>>>> >>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, >>>>> 16 >>>>> ... processes with the option -log_summary and send (as >>>>> attachments) >>>>> the log summary information. 
>>>>> >>>>> Also on the same machine run the streams benchmark; with recent >>>>> releases of PETSc you only need to do >>>>> >>>>> cd $PETSC_DIR >>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>> >>>>> and send the output. >>>>> >>>>> I suspect that you are doing everything fine and it is more an >>>>> issue >>>>> with the configuration of your machine. Also read the information >>>>> at >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>>> "binding" >>>>> >>>>> Barry >>>>> >>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >>>>> wrote: >>>>> >>>>> Hello, >>>>> >>>>> I have been using PETSc for a few months now, and it truly is >>>>> fantastic piece of software. >>>>> >>>>> In my particular example I am working with a large, sparse >>>>> distributed (MPI AIJ) matrix we can refer as 'G'. >>>>> G is a horizontal - retangular matrix (for example, 1,1 Million >>>>> rows per 2,1 Million columns). This matrix is commonly very sparse >>>>> and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% >>>>> are on the diagonal block of MPI AIJ representation). >>>>> To work with this matrix, I also have a few parallel vectors >>>>> (created using MatCreate Vec), we can refer as 'm' and 'k'. >>>>> I am trying to parallelize an iterative algorithm in which the >>>>> most computational heavy operations are: >>>>> >>>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>>>> (MatMultAdd). From what I have been reading, to achive a good >>>>> speedup in this operation, G should be as much diagonal as >>>>> possible, due to overlapping communication and computation. But >>>>> even when using a G matrix in which the diagonal block has ~95% of >>>>> the nnz, I cannot get a decent speedup. Most of the times, the >>>>> performance even gets worse. >>>>> >>>>> ->Matrix-Matrix Multiplication, in this case I need to perform G >>>>> * G' = A, where A is later used on the linear solver and G' is >>>>> transpose of G. The speedup in this operation is not worse, >>>>> although is not very good. >>>>> >>>>> ->Linear problem solving. Lastly, In this operation I compute >>>>> "Ax=b" from the last two operations. I tried to apply a RCM >>>>> permutation to A to make it more diagonal, for better performance. >>>>> However, the problem I faced was that, the permutation is performed >>>>> locally in each processor and thus, the final result is different >>>>> with different number of processors. I assume this was intended to >>>>> reduce communication. The solution I found was >>>>> 1-calculate A >>>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>>> 3-apply this permutation to the lines of G. >>>>> This works well, and A is generated as if RCM permuted. It is >>>>> fine to do this operation in one machine because it is only done >>>>> once while reading the input. The nnz of G become more spread and >>>>> less diagonal, causing problems when calculating G * m + k = b. >>>>> >>>>> These 3 operations (except the permutation) are performed in each >>>>> iteration of my algorithm. >>>>> >>>>> So, my questions are. >>>>> -What are the characteristics of G that lead to a good speedup in >>>>> the operations I described? Am I missing something and too much >>>>> obsessed with the diagonal block? >>>>> >>>>> -Is there a better way to permute A without permute G and still >>>>> get the same result using 1 or N machines? >>>>> >>>>> >>>>> I have been avoiding asking for help for a while. I'm very sorry >>>>> for the long email. 
>>>>> Thank you very much for your time. >>>>> Best Regards, >>>>> Nelson >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to >>>>> which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >> >> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix-after.png Type: application/octet-stream Size: 1818 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix-before.png Type: application/octet-stream Size: 2058 bytes Desc: not available URL: From knepley at gmail.com Mon Aug 24 09:28:06 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 09:28:06 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Message-ID: On Mon, Aug 24, 2015 at 9:24 AM, Nelson Filipe Lopes da Silva < nelsonflsilva at ist.utl.pt> wrote: > Hello. Thank you very much for your time. > > I understood the idea, it works very well. > I also noticed that my algorithm performs a different number of iterations > with different number of machines. The stop conditions are calculated using > PETSc "matmultadd". I'm very positive that there may be a program bug in my > code, or could it be something with PETSc? > In parallel, a total order on summation is not guaranteed, and thus you will have jitter in the result. However, your iteration seems extremely sensitive to this (10s of iterations difference). Thus it seems that either your iterative tolerance is down around round error, which is usually oversolving, or you have an incredibly ill-conditioned system. Thanks, Matt > I also need to figure out why those vecmax ratio are so high. The vecset > is understandable as I'm distributing the initial information from the root > machine in sequencial. > > These are the new values: > 1 machine > [0] Matrix diagonal_nnz:16800000 (100.00 %) > [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) > ExecTime: 4min47sec > Iterations: 236 > > 2 machines > [0] Matrix diagonal_nnz:8000000 (95.24 %) > [1] Matrix diagonal_nnz:7600000 (90.48 %) > > [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > ExecTime: 5min26sec > Iterations: 330 > > 3 machines > [0] Matrix diagonal_nnz:5333340 (95.24 %) > [1] Matrix diagonal_nnz:4800012 (85.71 %) > [2] Matrix diagonal_nnz:4533332 (80.95 %) > > [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) > ExecTime: 5min25sec > Iterations: 346 > > The suggested permutation worked very well in comparison with the original > matrix structure. 
The no-speedup may be related with the different number > of iterations. > > Once again, thank you very much for the time. > Cheers, > Nelson > > > Em 2015-08-23 20:19, Barry Smith escreveu: > >> A suggestion: take your second ordering and now interlace the second >> half of the rows with the first half of the rows (keeping the some >> column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 >> etc this will take the two separate "diagonal" bands and form a >> single "diagonal band". This will increase the "diagonal block >> weight" to be pretty high and the only scatters will need to be for >> the final rows of the input vector that all processes need to do their >> part of the multiply. Generate the image to make sure what I suggest >> make sense and then run this ordering with 1, 2, and 3 processes. Send >> the logs. >> >> Barry >> >> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva < >>> nelsonflsilva at ist.utl.pt> wrote: >>> >>> Thank you for the fast response! >>> >>> Yes. The last rows of the matrix are indeed more dense, compared with >>> the remaining ones. >>> For this example, concerning load balance between machines, the last >>> process had 46% of the matrix nonzero entries. A few weeks ago I suspected >>> of this problem and wrote a little function that could permute the matrix >>> rows based on their number of nonzeros. However, the matrix would become >>> less pleasant regarding "diagonal block weight", and I stop using it as i >>> thought I was becoming worse. >>> >>> Also, due to this problem, I thought I could have a complete vector copy >>> in each processor, instead of a distributed vector. I tried to implement >>> this idea, but had no luck with the results. However, even if this solution >>> would work, the communication for vector update was inevitable once each >>> iteration of my algorithm. >>> Since this is a rectangular matrix, I cannot apply RCM or such >>> permutations, however I can permute rows and columns though. >>> >>> More specifically, the problem I'm trying to solve is one of balance the >>> best guess and uncertainty estimates of a set of Input-Output subject to >>> linear constraints and ancillary informations. The matrix is called an >>> aggregation matrix, and each entry can be 1, 0 or -1. I don't know the >>> cause of its nonzero structure. I'm addressing this problem using a >>> weighted least-squares algorithm. >>> >>> I ran the code with a different, more friendly problem topology, logging >>> the load of nonzero entries and the "diagonal load" per processor. >>> I'm sending images of both matrices nonzero structure. The last email >>> example used matrix1, the example in this email uses matrix2. >>> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and >>> 5.171.901 nnz. >>> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and >>> 16.800.000 nnz. >>> >>> >>> With 1,2,3 machines, I have these distributions of nonzeros (using >>> matrix2). I'm sending the logs in this email. 
>>> 1 machine >>> [0] Matrix diagonal_nnz:16800000 (100.00 %) >>> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >>> ExecTime: 4min47sec >>> >>> 2 machines >>> [0] Matrix diagonal_nnz:4400000 (52.38 %) >>> [1] Matrix diagonal_nnz:4000000 (47.62 %) >>> >>> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>> ExecTime: 13min23sec >>> >>> 3 machines >>> [0] Matrix diagonal_nnz:2933334 (52.38 %) >>> [1] Matrix diagonal_nnz:533327 (9.52 %) >>> [2] Matrix diagonal_nnz:2399999 (42.86 %) >>> >>> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) >>> ExecTime: 20min26sec >>> >>> As for the network, I ran the make streams NPMAX=3 again. I'm also >>> sending it in this email. >>> >>> I too think that these bad results are caused by a combination of bad >>> matrix structure, especially the "diagonal weight", and maybe network. >>> >>> I really should find a way to permute these matrices to a more friendly >>> structure. >>> >>> Thank you very much for the help. >>> Nelson >>> >>> Em 2015-08-22 22:49, Barry Smith escreveu: >>> >>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva < >>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>> >>>>> Hi. >>>>> >>>>> >>>>> I managed to finish the re-implementation. I ran the program with >>>>> 1,2,3,4,5,6 machines and saved the summary. I send each of them in this >>>>> email. >>>>> In these executions, the program performs Matrix-Vector (MatMult, >>>>> MatMultAdd) products and Vector-Vector operations. From what I understand >>>>> while reading the logs, the program takes most of the time in >>>>> "VecScatterEnd". >>>>> In this example, the matrix taking part on the Matrix-Vector products >>>>> is not "much diagonal heavy". >>>>> The following numbers are the percentages of nnz values on the matrix >>>>> diagonal block for each machine, and each execution time. >>>>> NMachines %NNZ ExecTime >>>>> 1 machine0 100%; 16min08sec >>>>> >>>>> 2 machine0 91.1%; 24min58sec >>>>> machine1 69.2%; >>>>> >>>>> 3 machine0 90.9% 25min42sec >>>>> machine1 82.8% >>>>> machine2 51.6% >>>>> >>>>> 4 machine0 91.9% 26min27sec >>>>> machine1 82.4% >>>>> machine2 73.1% >>>>> machine3 39.9% >>>>> >>>>> 5 machine0 93.2% 39min23sec >>>>> machine1 82.8% >>>>> machine2 74.4% >>>>> machine3 64.6% >>>>> machine4 31.6% >>>>> >>>>> 6 machine0 94.2% 54min54sec >>>>> machine1 82.6% >>>>> machine2 73.1% >>>>> machine3 65.2% >>>>> machine4 55.9% >>>>> machine5 25.4% >>>>> >>>> >>>> Based on this I am guessing the last rows of the matrix have a lot >>>> of nonzeros away from the diagonal? >>>> >>>> There is a big load imbalance in something: for example with 2 >>>> processes you have >>>> >>>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>>> >>>> the 5th column has the imbalance between slowest and fastest >>>> process. 
It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >>>> get good speed ups these need to be much closer to 1. >>>> >>>> How many nonzeros in the matrix are there per process? Is it very >>>> different for difference processes? You really need to have each >>>> process have similar number of matrix nonzeros. Do you have a >>>> picture of the nonzero structure of the matrix? Where does the matrix >>>> come from, why does it have this structure? >>>> >>>> Also likely there are just to many vector entries that need to be >>>> scattered to the last process for the matmults. >>>> >>>>> >>>>> In this implementation I'm using MatCreate and VecCreate. I'm also >>>>> leaving the partition sizes in PETSC_DECIDE. >>>>> >>>>> Finally, to run the application, I'm using mpirun.hydra from mpich, >>>>> downloaded by PETSc configure script. >>>>> I'm checking the process assignment as suggested on the last email. >>>>> >>>>> Am I missing anything? >>>>> >>>> >>>> Your network is very poor; likely ethernet. It is had to get much >>>> speedup with such slow reductions and sends and receives. >>>> >>>> Average time to get PetscTime(): 1.19209e-07 >>>> Average time for MPI_Barrier(): 0.000215769 >>>> Average time for zero size MPI_Send(): 5.94854e-05 >>>> >>>> I think you are seeing such bad results due to an unkind matrix >>>> nonzero structure giving per load balance and too much communication >>>> and a very poor computer network that just makes all the needed >>>> communication totally dominate. >>>> >>>> >>>> >>>>> Regards, >>>>> Nelson >>>>> >>>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva < >>>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>>> Hello. >>>>>> >>>>>> I am sorry for the long time without response. I decided to rewrite >>>>>> my application in a different way and will send the log_summary output when >>>>>> done reimplementing. >>>>>> >>>>>> As for the machine, I am using mpirun to run jobs in a 8 node >>>>>> cluster. I modified the makefile on the steams folder so it would run using >>>>>> my hostfile. >>>>>> The output is attached to this email. It seems reasonable for a >>>>>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 >>>>>> socket. >>>>>> 1) You launcher is placing processes haphazardly. I would figure out >>>>>> how to assign them to certain nodes >>>>>> 2) Each node has enough bandwidth for 1 core, so it does not make >>>>>> much sense to use more than 1. >>>>>> Thanks, >>>>>> Matt >>>>>> >>>>>> Cheers, >>>>>> Nelson >>>>>> >>>>>> >>>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>>>> ... processes with the option -log_summary and send (as attachments) >>>>>> the log summary information. >>>>>> >>>>>> Also on the same machine run the streams benchmark; with recent >>>>>> releases of PETSc you only need to do >>>>>> >>>>>> cd $PETSC_DIR >>>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>>> >>>>>> and send the output. >>>>>> >>>>>> I suspect that you are doing everything fine and it is more an issue >>>>>> with the configuration of your machine. 
Also read the information at >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>>>> "binding" >>>>>> >>>>>> Barry >>>>>> >>>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva < >>>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> I have been using PETSc for a few months now, and it truly is >>>>>> fantastic piece of software. >>>>>> >>>>>> In my particular example I am working with a large, sparse >>>>>> distributed (MPI AIJ) matrix we can refer as 'G'. >>>>>> G is a horizontal - retangular matrix (for example, 1,1 Million rows >>>>>> per 2,1 Million columns). This matrix is commonly very sparse and not >>>>>> diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the >>>>>> diagonal block of MPI AIJ representation). >>>>>> To work with this matrix, I also have a few parallel vectors (created >>>>>> using MatCreate Vec), we can refer as 'm' and 'k'. >>>>>> I am trying to parallelize an iterative algorithm in which the most >>>>>> computational heavy operations are: >>>>>> >>>>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>>>>> (MatMultAdd). From what I have been reading, to achive a good speedup in >>>>>> this operation, G should be as much diagonal as possible, due to >>>>>> overlapping communication and computation. But even when using a G matrix >>>>>> in which the diagonal block has ~95% of the nnz, I cannot get a decent >>>>>> speedup. Most of the times, the performance even gets worse. >>>>>> >>>>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' >>>>>> = A, where A is later used on the linear solver and G' is transpose of G. >>>>>> The speedup in this operation is not worse, although is not very good. >>>>>> >>>>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >>>>>> from the last two operations. I tried to apply a RCM permutation to A to >>>>>> make it more diagonal, for better performance. However, the problem I faced >>>>>> was that, the permutation is performed locally in each processor and thus, >>>>>> the final result is different with different number of processors. I assume >>>>>> this was intended to reduce communication. The solution I found was >>>>>> 1-calculate A >>>>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>>>> 3-apply this permutation to the lines of G. >>>>>> This works well, and A is generated as if RCM permuted. It is fine to >>>>>> do this operation in one machine because it is only done once while reading >>>>>> the input. The nnz of G become more spread and less diagonal, causing >>>>>> problems when calculating G * m + k = b. >>>>>> >>>>>> These 3 operations (except the permutation) are performed in each >>>>>> iteration of my algorithm. >>>>>> >>>>>> So, my questions are. >>>>>> -What are the characteristics of G that lead to a good speedup in the >>>>>> operations I described? Am I missing something and too much obsessed with >>>>>> the diagonal block? >>>>>> >>>>>> -Is there a better way to permute A without permute G and still get >>>>>> the same result using 1 or N machines? >>>>>> >>>>>> >>>>>> I have been avoiding asking for help for a while. I'm very sorry for >>>>>> the long email. >>>>>> Thank you very much for your time. >>>>>> Best Regards, >>>>>> Nelson >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. 
>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Mon Aug 24 11:08:48 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Mon, 24 Aug 2015 17:08:48 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Message-ID: <7562512c261da763c0c7b1807640c637@mail.ist.utl.pt> I understand. That was indeed the case. I have been experimenting with different values and thresholds. The program was indeed oversolving due to severe low threshold values. Now all executions run for the same number of iterations. The computational part of the program seems to be showing some speedup! The program was indeed suffering from the poor matrix structure and the problem was solved with the suggested permutation. I'll keep experimenting with different matrices to figure out the best permutations for each case. Thank you very much for your time! Best regards, Nelson Em 2015-08-24 15:28, Matthew Knepley escreveu: > On Mon, Aug 24, 2015 at 9:24 AM, Nelson Filipe Lopes da Silva wrote: > >> Hello. Thank you very much for your time. >> >> I understood the idea, it works very well. >> I also noticed that my algorithm performs a different number of iterations with different number of machines. The stop conditions are calculated using PETSc "matmultadd". I'm very positive that there may be a program bug in my code, or could it be something with PETSc? > > In parallel, a total order on summation is not guaranteed, and thus you will have jitter in the result. However, your > iteration seems extremely sensitive to this (10s of iterations difference). Thus it seems that either your iterative > tolerance is down around round error, which is usually oversolving, or you have an incredibly ill-conditioned system. > Thanks, > Matt > >> I also need to figure out why those vecmax ratio are so high. The vecset is understandable as I'm distributing the initial information from the root machine in sequencial. >> >> These are the new values: >> 1 machine >> [0] Matrix diagonal_nnz:16800000 (100.00 %) >> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >> ExecTime: 4min47sec >> Iterations: 236 >> >> 2 machines >> [0] Matrix diagonal_nnz:8000000 (95.24 %) >> [1] Matrix diagonal_nnz:7600000 (90.48 %) >> >> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >> ExecTime: 5min26sec >> Iterations: 330 >> >> 3 machines >> [0] Matrix diagonal_nnz:5333340 (95.24 %) >> [1] Matrix diagonal_nnz:4800012 (85.71 %) >> [2] Matrix diagonal_nnz:4533332 (80.95 %) >> >> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) >> ExecTime: 5min25sec >> Iterations: 346 >> >> The suggested permutation worked very well in comparison with the original matrix structure. 
The no-speedup may be related with the different number of iterations. >> >> Once again, thank you very much for the time. >> Cheers, >> Nelson >> >> Em 2015-08-23 20:19, Barry Smith escreveu: >> >>> A suggestion: take your second ordering and now interlace the second >>> half of the rows with the first half of the rows (keeping the some >>> column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 >>> etc this will take the two separate "diagonal" bands and form a >>> single "diagonal band". This will increase the "diagonal block >>> weight" to be pretty high and the only scatters will need to be for >>> the final rows of the input vector that all processes need to do their >>> part of the multiply. Generate the image to make sure what I suggest >>> make sense and then run this ordering with 1, 2, and 3 processes. Send >>> the logs. >>> >>> Barry >>> >>>> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva wrote: >>>> >>>> Thank you for the fast response! >>>> >>>> Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. >>>> For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. >>>> >>>> Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. >>>> Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. >>>> >>>> More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. >>>> >>>> I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. >>>> I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. >>>> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. >>>> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. >>>> >>>> With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. 
>>>> 1 machine >>>> [0] Matrix diagonal_nnz:16800000 (100.00 %) >>>> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >>>> ExecTime: 4min47sec >>>> >>>> 2 machines >>>> [0] Matrix diagonal_nnz:4400000 (52.38 %) >>>> [1] Matrix diagonal_nnz:4000000 (47.62 %) >>>> >>>> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>>> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>>> ExecTime: 13min23sec >>>> >>>> 3 machines >>>> [0] Matrix diagonal_nnz:2933334 (52.38 %) >>>> [1] Matrix diagonal_nnz:533327 (9.52 %) >>>> [2] Matrix diagonal_nnz:2399999 (42.86 %) >>>> >>>> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>>> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>>> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) >>>> ExecTime: 20min26sec >>>> >>>> As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. >>>> >>>> I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. >>>> >>>> I really should find a way to permute these matrices to a more friendly structure. >>>> >>>> Thank you very much for the help. >>>> Nelson >>>> >>>> Em 2015-08-22 22:49, Barry Smith escreveu: >>>> >>>>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: >>>>>> >>>>>> Hi. >>>>>> >>>>>> I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. >>>>>> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd". >>>>>> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy". >>>>>> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time. >>>>>> NMachines %NNZ ExecTime >>>>>> 1 machine0 100%; 16min08sec >>>>>> >>>>>> 2 machine0 91.1%; 24min58sec >>>>>> machine1 69.2%; >>>>>> >>>>>> 3 machine0 90.9% 25min42sec >>>>>> machine1 82.8% >>>>>> machine2 51.6% >>>>>> >>>>>> 4 machine0 91.9% 26min27sec >>>>>> machine1 82.4% >>>>>> machine2 73.1% >>>>>> machine3 39.9% >>>>>> >>>>>> 5 machine0 93.2% 39min23sec >>>>>> machine1 82.8% >>>>>> machine2 74.4% >>>>>> machine3 64.6% >>>>>> machine4 31.6% >>>>>> >>>>>> 6 machine0 94.2% 54min54sec >>>>>> machine1 82.6% >>>>>> machine2 73.1% >>>>>> machine3 65.2% >>>>>> machine4 55.9% >>>>>> machine5 25.4% >>>>> >>>>> Based on this I am guessing the last rows of the matrix have a lot >>>>> of nonzeros away from the diagonal? >>>>> >>>>> There is a big load imbalance in something: for example with 2 >>>>> processes you have >>>>> >>>>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>>>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>>>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>>>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>>>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>>>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>>>> >>>>> the 5th column has the imbalance between slowest and fastest >>>>> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >>>>> get good speed ups these need to be much closer to 1. 
>>>>> >>>>> How many nonzeros in the matrix are there per process? Is it very >>>>> different for difference processes? You really need to have each >>>>> process have similar number of matrix nonzeros. Do you have a >>>>> picture of the nonzero structure of the matrix? Where does the matrix >>>>> come from, why does it have this structure? >>>>> >>>>> Also likely there are just to many vector entries that need to be >>>>> scattered to the last process for the matmults. >>>>> >>>>>> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE. >>>>>> >>>>>> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script. >>>>>> I'm checking the process assignment as suggested on the last email. >>>>>> >>>>>> Am I missing anything? >>>>> >>>>> Your network is very poor; likely ethernet. It is had to get much >>>>> speedup with such slow reductions and sends and receives. >>>>> >>>>> Average time to get PetscTime(): 1.19209e-07 >>>>> Average time for MPI_Barrier(): 0.000215769 >>>>> Average time for zero size MPI_Send(): 5.94854e-05 >>>>> >>>>> I think you are seeing such bad results due to an unkind matrix >>>>> nonzero structure giving per load balance and too much communication >>>>> and a very poor computer network that just makes all the needed >>>>> communication totally dominate. >>>>> >>>>> Regards, >>>>> Nelson >>>>> >>>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote: >>>>> Hello. >>>>> >>>>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. >>>>> >>>>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. >>>>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. >>>>> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >>>>> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >>>>> Thanks, >>>>> Matt >>>>> >>>>> Cheers, >>>>> Nelson >>>>> >>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>>> ... processes with the option -log_summary and send (as attachments) >>>>> the log summary information. >>>>> >>>>> Also on the same machine run the streams benchmark; with recent >>>>> releases of PETSc you only need to do >>>>> >>>>> cd $PETSC_DIR >>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>> >>>>> and send the output. >>>>> >>>>> I suspect that you are doing everything fine and it is more an issue >>>>> with the configuration of your machine. Also read the information at >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener Links: ------ [1] mailto:nelsonflsilva at ist.utl.pt [2] mailto:nelsonflsilva at ist.utl.pt [3] mailto:nelsonflsilva at ist.utl.pt [4] mailto:nelsonflsilva at ist.utl.pt [5] mailto:nelsonflsilva at ist.utl.pt -------------- next part -------------- An HTML attachment was scrubbed... 
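A minimal C sketch of the interlaced row ordering suggested above (0, n/2, 1, n/2+1, ...), kept here for reference. The routine name InterlaceRows and the use of MatPermute are assumptions, not code from this thread; the orientation of the index set (old-to-new versus new-to-old) should be checked against the MatPermute man page, and if MatPermute turns out not to be supported in parallel for the matrix type in use, the same index mapping can instead be applied when the matrix is first assembled.

#include <petscmat.h>

/* Sketch: permute the rows of a (possibly rectangular) parallel matrix A
   into the interlaced order 0, n/2, 1, n/2+1, ... while leaving the
   column ordering unchanged. */
PetscErrorCode InterlaceRows(Mat A, Mat *B)
{
  PetscErrorCode ierr;
  PetscInt       nrows, ncols, rstart, rend, cstart, cend, r, *idx;
  IS             rowperm, colperm;

  ierr = MatGetSize(A, &nrows, &ncols);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  ierr = MatGetOwnershipRangeColumn(A, &cstart, &cend);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &idx);CHKERRQ(ierr);
  for (r = rstart; r < rend; r++) {
    /* rows of the first half go to 2r, rows of the second half go to
       2(r - n/2) + 1; verify whether MatPermute expects this map or its
       inverse before relying on it */
    idx[r - rstart] = (r < nrows/2) ? 2*r : 2*(r - nrows/2) + 1;
  }
  ierr = ISCreateGeneral(PetscObjectComm((PetscObject)A), rend - rstart, idx,
                         PETSC_OWN_POINTER, &rowperm);CHKERRQ(ierr);
  /* identity permutation on the columns */
  ierr = ISCreateStride(PetscObjectComm((PetscObject)A), cend - cstart, cstart, 1,
                        &colperm);CHKERRQ(ierr);
  ierr = MatPermute(A, rowperm, colperm, B);CHKERRQ(ierr);
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  return 0;
}

Running the MatMult test on the permuted matrix with 1, 2 and 3 processes, as suggested, should then show directly whether the larger diagonal block weight reduces the VecScatter traffic.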
URL: From zonexo at gmail.com Mon Aug 24 21:01:47 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 10:01:47 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. So MatSetValuesStencil() takes global vertex numbers. Do you mean the natural or petsc ordering? Which is a simpler/better option? MatSetValuesStencil() Is there an example in Fortran for MatSetValuesStencil? Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? No. Thanks, Matt Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Mon Aug 24 21:06:23 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 10:06:23 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: <1440468386078-b9b15ac2-3f34e8e5-a4536505@gmail.com> Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 5:54 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com [timothee.nicolas at gmail.com] > wrote: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. 
Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee Hi Timothee, Thanks for the help. So for boundary pts I will just leave blank for non existent locations? Also, can I use PETSc multigrid to solve this problem? This is a poisson eqn. 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > : Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 24 21:11:29 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 21:11:29 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Message-ID: On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay wrote: > > > Sent using CloudMagic > > On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley > wrote: > > On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> > > No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes > global vertex numbers. > > > So MatSetValuesStencil() takes global vertex numbers. Do you mean the > natural or petsc ordering? > There is no PETSc ordering for vertices, only the natural ordering. Thanks, Matt > Which is a simpler/better option? >> > > MatSetValuesStencil() > >> Is there an example in Fortran for MatSetValuesStencil? >> > > Timoth?e Nicolas shows one in his reply. 
> > Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> > > No. > > Thanks, > > Matt > >> Thanks! >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 25 00:45:33 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 14:45:33 +0900 Subject: [petsc-users] Function evaluation slowness ? Message-ID: Hi, I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. So I have some questions about this. 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. Best regards Timothee NICOLAS -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 00:56:41 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 00:56:41 -0500 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: References: Message-ID: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > Hi, > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). 
> > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > So I have some questions about this. > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) { *nfuncs = snes->nfuncs; } PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) { ... snes->nfuncs++; } PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) { ..... if (snes->pc && snes->pcside == PC_LEFT) { ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); } else { ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); } } So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. Barry > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. > > Best regards > > Timothee NICOLAS From timothee.nicolas at gmail.com Tue Aug 25 01:21:24 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 15:21:24 +0900 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> Message-ID: Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? 
In which case I suppose it is normal that it takes a significant amount of time. So this ~10 times increase does not look normal right ? Best Timothee NICOLAS 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > Hi, > > > > I am testing PETSc on the supercomputer where I used to run my explicit > MHD code. For my tests I use 256 processes on a problem of size 128*128*640 > = 10485760, that is, 40960 grid points per process, and 8 degrees of > freedom (or physical fields). The explicit code was using Runge-Kutta 4 for > the time scheme, which means 4 function evaluation per time step (plus one > operation to put everything together, but let's forget this one). > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > This means a time per function evaluation of about 0.5 to 1 second, that > is, 10 to 20 times slower. > > > > So I have some questions about this. > > > > 1. First does SNESGetNumberFunctionEvals take into account the function > evaluations required to evaluate the Jacobian when -snes_mf is used, as > well as the operations required by the GMRES (Krylov) method ? If it were > the case, I would somehow intuitively expect a number larger than 17, which > could explain the increase in time. > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > { > *nfuncs = snes->nfuncs; > } > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > { > ... > snes->nfuncs++; > } > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > { > ..... > if (snes->pc && snes->pcside == PC_LEFT) { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > } else { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > } > } > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > What does your -log_summary output look like? One thing that GMRES does > is it introduces a global reduction with each multiple (hence a barrier > across all your processes) on some systems this can be deadly. > > Barry > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > I realize of course that preconditioning should make all this smoother, > in particular allowing larger time steps, but here I am just concerned > about the sheer Function evaluation time. > > > > Best regards > > > > Timothee NICOLAS > > -------------- next part -------------- An HTML attachment was scrubbed... 
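One way to pin down where the ~0.7 s per evaluation goes is to register a user log event around the residual routine and a separate log stage around SNESSolve, so that -log_summary reports them on their own lines. The sketch below is in C (the same logging routines are available from the Fortran interface); MyFormFunction, UserFEval and the stage name are made-up illustrative names, not code from this thread.

#include <petscsnes.h>

static PetscLogEvent USER_FEVAL;  /* hypothetical user event */

/* Wrap the body of the SNES residual routine in the event so its time,
   flop rate and load balance show up as a separate line in -log_summary. */
PetscErrorCode MyFormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  PetscErrorCode ierr;
  ierr = PetscLogEventBegin(USER_FEVAL, 0, 0, 0, 0);CHKERRQ(ierr);
  /* ... existing residual computation ... */
  ierr = PetscLogEventEnd(USER_FEVAL, 0, 0, 0, 0);CHKERRQ(ierr);
  return 0;
}

/* In the main program: register once after PetscInitialize(), and push a
   stage around each SNESSolve so that work done outside the solve
   (initial guess, monitors, diagnostics) is logged in a different stage.

     PetscLogStage solve_stage;
     ierr = PetscLogEventRegister("UserFEval", SNES_CLASSID, &USER_FEVAL);CHKERRQ(ierr);
     ierr = PetscLogStageRegister("SNESSolve only", &solve_stage);CHKERRQ(ierr);

     ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);
     ierr = SNESSolve(snes, NULL, X);CHKERRQ(ierr);
     ierr = PetscLogStagePop();CHKERRQ(ierr);
*/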
URL: -------------- next part -------------- This is an implicit MHD code based on the MIPS code Setting all options Start: Reading HINT2 equilibrium file DATA: lr,lz,lphi= 128 128 640 lsymmetry= 1 r_min,r_max= 2.70000000000000 4.60000000000000 z_min,z_max= -0.950000000000000 0.950000000000000 phi_min,phi_max= 0.000000000000000E+000 6.28318530717959 dr,dz,dphi= 1.496062992125984E-002 1.496062992125984E-002 9.817477042468103E-003 pmax= 1.135504809500237E-002 bmax= 2.98086676166910 End: Reading HINT2 equilibrium file Creating nonlinear solver, getting geometrical info, and setting vectors Allocating arrays for which it is required Masks definition Major radius and vacuum definition Initializing PETSc Vecs with equilibrium values Set the initial force local vectors (used to enforce the equilibrium) Add a random perturbation to the velocity Entering the main MHD Loop Iteration number = 1 Time (tau_A) = 1.0000E-02 0 SNES Function norm 5.382589763410e-05 1 SNES Function norm 4.917642384947e-11 2 SNES Function norm 4.187109891461e-16 Kinetic Energy = 9.0028E-16 Magnetic Energy = 1.3544E-16 Total CPU time since PetscInitialize: 7.7581E+00 CPU time used for SNESSolve: 5.0903E-01 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 2 Time (tau_A) = 2.0000E-02 0 SNES Function norm 8.317717709972e-16 1 SNES Function norm 1.373112118906e-16 Kinetic Energy = 8.9656E-16 Magnetic Energy = 5.4177E-16 Total CPU time since PetscInitialize: 5.0395E+01 CPU time used for SNESSolve: 3.0864E+01 Number of linear iterations : 6 Number of function evaluations : 9 Too few linear iterations; Time step increased to dt = 1.3333E-02 Iteration number = 3 Time (tau_A) = 3.3333E-02 0 SNES Function norm 1.732338962433e-05 1 SNES Function norm 3.798031074240e-11 2 SNES Function norm 5.941618837381e-16 Kinetic Energy = 8.8707E-16 Magnetic Energy = 1.4783E-15 Total CPU time since PetscInitialize: 6.2962E+01 CPU time used for SNESSolve: 8.4275E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 4 Time (tau_A) = 4.6667E-02 0 SNES Function norm 8.650290226831e-07 1 SNES Function norm 3.159589761232e-12 2 SNES Function norm 5.559310443305e-16 Kinetic Energy = 8.7490E-16 Magnetic Energy = 2.8512E-15 Total CPU time since PetscInitialize: 7.5298E+01 CPU time used for SNESSolve: 7.2959E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 5 Time (tau_A) = 6.0000E-02 0 SNES Function norm 1.814549199007e-06 1 SNES Function norm 6.550542684026e-12 2 SNES Function norm 5.561417252951e-16 Kinetic Energy = 8.5968E-16 Magnetic Energy = 4.6147E-15 Total CPU time since PetscInitialize: 8.6579E+01 CPU time used for SNESSolve: 8.3290E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 6 Time (tau_A) = 7.3333E-02 0 SNES Function norm 5.143158573116e-07 1 SNES Function norm 1.854573359134e-12 2 SNES Function norm 5.555619916593e-16 Kinetic Energy = 8.4209E-16 Magnetic Energy = 6.7217E-15 Total CPU time since PetscInitialize: 9.8827E+01 CPU time used for SNESSolve: 9.6296E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 7 Time (tau_A) = 8.6667E-02 0 SNES Function norm 1.241769129665e-07 1 SNES Function norm 2.229921948896e-13 2 SNES Function norm 5.547572310422e-16 Kinetic Energy = 8.2295E-16 Magnetic Energy = 9.1167E-15 Total CPU time since PetscInitialize: 1.1428E+02 CPU time used for SNESSolve: 1.1251E+01 Number of linear iterations : 12 Number of 
function evaluations : 17 Iteration number = 8 Time (tau_A) = 1.0000E-01 0 SNES Function norm 4.127596671841e-08 1 SNES Function norm 9.722821486344e-14 2 SNES Function norm 5.547721956164e-16 Kinetic Energy = 8.0310E-16 Magnetic Energy = 1.1740E-14 Total CPU time since PetscInitialize: 1.2603E+02 CPU time used for SNESSolve: 3.1856E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 9 Time (tau_A) = 1.1333E-01 0 SNES Function norm 6.526056598290e-08 1 SNES Function norm 1.551059106556e-13 2 SNES Function norm 5.554664894406e-16 Kinetic Energy = 7.8340E-16 Magnetic Energy = 1.4531E-14 Total CPU time since PetscInitialize: 1.3916E+02 CPU time used for SNESSolve: 1.0360E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 10 Time (tau_A) = 1.2667E-01 0 SNES Function norm 7.216736548862e-08 1 SNES Function norm 1.709747035544e-13 2 SNES Function norm 5.551116565705e-16 Kinetic Energy = 7.6463E-16 Magnetic Energy = 1.7432E-14 Total CPU time since PetscInitialize: 1.5486E+02 CPU time used for SNESSolve: 1.4236E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 11 Time (tau_A) = 1.4000E-01 0 SNES Function norm 7.957568936931e-08 1 SNES Function norm 7.164583362638e-13 2 SNES Function norm 5.547681147827e-16 Kinetic Energy = 7.4749E-16 Magnetic Energy = 2.0389E-14 Total CPU time since PetscInitialize: 1.6355E+02 CPU time used for SNESSolve: 5.0159E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 12 Time (tau_A) = 1.5333E-01 0 SNES Function norm 8.580881227555e-08 1 SNES Function norm 7.177441407594e-13 2 SNES Function norm 5.549122251796e-16 Kinetic Energy = 7.3253E-16 Magnetic Energy = 2.3356E-14 Total CPU time since PetscInitialize: 1.7703E+02 CPU time used for SNESSolve: 8.7550E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 13 Time (tau_A) = 1.6667E-01 0 SNES Function norm 9.036148523559e-08 1 SNES Function norm 7.596864452710e-13 2 SNES Function norm 5.552150394126e-16 Kinetic Energy = 7.2017E-16 Magnetic Energy = 2.6296E-14 Total CPU time since PetscInitialize: 1.9497E+02 CPU time used for SNESSolve: 1.0766E+01 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 14 Time (tau_A) = 1.8000E-01 0 SNES Function norm 9.318522525551e-08 1 SNES Function norm 8.310283593293e-13 2 SNES Function norm 5.549000838284e-16 Kinetic Energy = 7.1065E-16 Magnetic Energy = 2.9180E-14 Total CPU time since PetscInitialize: 2.0960E+02 CPU time used for SNESSolve: 8.9322E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 15 Time (tau_A) = 1.9333E-01 0 SNES Function norm 9.426477938676e-08 1 SNES Function norm 9.230910004645e-13 2 SNES Function norm 5.545358678033e-16 Kinetic Energy = 7.0402E-16 Magnetic Energy = 3.1992E-14 Total CPU time since PetscInitialize: 2.2222E+02 CPU time used for SNESSolve: 3.2721E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 16 Time (tau_A) = 2.0667E-01 0 SNES Function norm 9.362326845768e-08 1 SNES Function norm 1.296102793360e-13 2 SNES Function norm 5.547628931478e-16 Kinetic Energy = 7.0019E-16 Magnetic Energy = 3.4725E-14 Total CPU time since PetscInitialize: 2.4133E+02 CPU time used for SNESSolve: 4.5367E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 17 Time (tau_A) = 2.2000E-01 0 SNES Function norm 
9.133250113813e-08 1 SNES Function norm 1.149831225960e-13 2 SNES Function norm 5.548205596917e-16 Kinetic Energy = 6.9892E-16 Magnetic Energy = 3.7381E-14 Total CPU time since PetscInitialize: 2.5612E+02 CPU time used for SNESSolve: 7.0432E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 18 Time (tau_A) = 2.3333E-01 0 SNES Function norm 8.751516705490e-08 1 SNES Function norm 1.044612716823e-13 2 SNES Function norm 5.546955425427e-16 Kinetic Energy = 6.9986E-16 Magnetic Energy = 3.9970E-14 Total CPU time since PetscInitialize: 2.7746E+02 CPU time used for SNESSolve: 1.7779E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 19 Time (tau_A) = 2.4667E-01 0 SNES Function norm 8.234829805103e-08 1 SNES Function norm 9.821327679226e-14 2 SNES Function norm 5.548194234735e-16 Kinetic Energy = 7.0258E-16 Magnetic Energy = 4.2510E-14 Total CPU time since PetscInitialize: 2.8796E+02 CPU time used for SNESSolve: 7.1363E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 20 Time (tau_A) = 2.6000E-01 0 SNES Function norm 7.607020340067e-08 1 SNES Function norm 9.607454720992e-14 2 SNES Function norm 5.546700693492e-16 Kinetic Energy = 7.0658E-16 Magnetic Energy = 4.5021E-14 Total CPU time since PetscInitialize: 3.0141E+02 CPU time used for SNESSolve: 9.5897E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 21 Time (tau_A) = 2.7333E-01 0 SNES Function norm 6.899369214087e-08 1 SNES Function norm 9.762527001937e-14 2 SNES Function norm 5.546534365057e-16 Kinetic Energy = 7.1138E-16 Magnetic Energy = 4.7526E-14 Total CPU time since PetscInitialize: 3.1839E+02 CPU time used for SNESSolve: 1.2120E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 22 Time (tau_A) = 2.8667E-01 0 SNES Function norm 6.153013990112e-08 1 SNES Function norm 1.022809264903e-13 2 SNES Function norm 5.559191241468e-16 Kinetic Energy = 7.1650E-16 Magnetic Energy = 5.0049E-14 Total CPU time since PetscInitialize: 3.3729E+02 CPU time used for SNESSolve: 1.4010E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 23 Time (tau_A) = 3.0000E-01 0 SNES Function norm 5.422883432610e-08 1 SNES Function norm 1.093821278111e-13 2 SNES Function norm 5.545153520497e-16 Kinetic Energy = 7.2152E-16 Magnetic Energy = 5.2614E-14 Total CPU time since PetscInitialize: 3.4506E+02 CPU time used for SNESSolve: 3.1828E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 24 Time (tau_A) = 3.1333E-01 0 SNES Function norm 4.782401973928e-08 1 SNES Function norm 1.182852644926e-13 2 SNES Function norm 5.545048290555e-16 Kinetic Energy = 7.2610E-16 Magnetic Energy = 5.5240E-14 Total CPU time since PetscInitialize: 3.5751E+02 CPU time used for SNESSolve: 3.4274E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 25 Time (tau_A) = 3.2667E-01 0 SNES Function norm 4.322955973001e-08 1 SNES Function norm 1.283072300260e-13 2 SNES Function norm 5.545681547284e-16 Kinetic Energy = 7.2996E-16 Magnetic Energy = 5.7944E-14 Total CPU time since PetscInitialize: 3.7469E+02 CPU time used for SNESSolve: 2.5263E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 26 Time (tau_A) = 3.4000E-01 0 SNES Function norm 4.132314115916e-08 1 SNES Function norm 1.386969354039e-13 2 SNES Function norm 5.540520147662e-16 Kinetic 
Energy = 7.3297E-16 Magnetic Energy = 6.0737E-14 Total CPU time since PetscInitialize: 3.8264E+02 CPU time used for SNESSolve: 3.1395E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 27 Time (tau_A) = 3.5333E-01 0 SNES Function norm 4.245684761661e-08 1 SNES Function norm 1.485504132442e-13 2 SNES Function norm 5.545487475751e-16 Kinetic Energy = 7.3503E-16 Magnetic Energy = 6.3627E-14 Total CPU time since PetscInitialize: 3.9247E+02 CPU time used for SNESSolve: 5.9592E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 28 Time (tau_A) = 3.6667E-01 0 SNES Function norm 4.616289783185e-08 1 SNES Function norm 1.567803392197e-13 2 SNES Function norm 5.548329671559e-16 Kinetic Energy = 7.3618E-16 Magnetic Energy = 6.6616E-14 Total CPU time since PetscInitialize: 4.0173E+02 CPU time used for SNESSolve: 5.8294E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 29 Time (tau_A) = 3.8000E-01 0 SNES Function norm 5.149150954830e-08 1 SNES Function norm 1.622328092002e-13 2 SNES Function norm 5.545725405186e-16 Kinetic Energy = 7.3651E-16 Magnetic Energy = 6.9706E-14 Total CPU time since PetscInitialize: 4.1289E+02 CPU time used for SNESSolve: 7.4448E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 30 Time (tau_A) = 3.9333E-01 0 SNES Function norm 5.751421272236e-08 1 SNES Function norm 1.639703075993e-13 2 SNES Function norm 5.551145331563e-16 Kinetic Energy = 7.3617E-16 Magnetic Energy = 7.2890E-14 Total CPU time since PetscInitialize: 4.2473E+02 CPU time used for SNESSolve: 9.8094E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 31 Time (tau_A) = 4.0667E-01 0 SNES Function norm 6.353011943021e-08 1 SNES Function norm 1.616610594001e-13 2 SNES Function norm 5.542606202109e-16 Kinetic Energy = 7.3536E-16 Magnetic Energy = 7.6164E-14 Total CPU time since PetscInitialize: 4.3767E+02 CPU time used for SNESSolve: 9.9831E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 32 Time (tau_A) = 4.2000E-01 0 SNES Function norm 6.905745342631e-08 1 SNES Function norm 1.557783742592e-13 2 SNES Function norm 5.544338434073e-16 Kinetic Energy = 7.3427E-16 Magnetic Energy = 7.9520E-14 Total CPU time since PetscInitialize: 4.6580E+02 CPU time used for SNESSolve: 2.7991E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 33 Time (tau_A) = 4.3333E-01 0 SNES Function norm 7.377639093458e-08 1 SNES Function norm 1.474466471932e-13 2 SNES Function norm 5.543376070989e-16 Kinetic Energy = 7.3312E-16 Magnetic Energy = 8.2950E-14 Total CPU time since PetscInitialize: 4.8419E+02 CPU time used for SNESSolve: 1.4762E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 34 Time (tau_A) = 4.4667E-01 0 SNES Function norm 7.748319943328e-08 1 SNES Function norm 1.381088301393e-13 2 SNES Function norm 5.550610629314e-16 Kinetic Energy = 7.3209E-16 Magnetic Energy = 8.6449E-14 Total CPU time since PetscInitialize: 4.9802E+02 CPU time used for SNESSolve: 1.3800E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 35 Time (tau_A) = 4.6000E-01 0 SNES Function norm 8.006178457271e-08 1 SNES Function norm 1.290725563320e-13 2 SNES Function norm 5.542186770005e-16 Kinetic Energy = 7.3134E-16 Magnetic Energy = 9.0010E-14 Total CPU time since PetscInitialize: 5.2157E+02 CPU time 
used for SNESSolve: 1.5918E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 36 Time (tau_A) = 4.7333E-01 0 SNES Function norm 8.146728382443e-08 1 SNES Function norm 1.213819113579e-13 2 SNES Function norm 5.543903059087e-16 Kinetic Energy = 7.3097E-16 Magnetic Energy = 9.3631E-14 Total CPU time since PetscInitialize: 5.3299E+02 CPU time used for SNESSolve: 7.5341E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 37 Time (tau_A) = 4.8667E-01 0 SNES Function norm 8.171674223552e-08 1 SNES Function norm 1.157134150395e-13 2 SNES Function norm 5.542672888432e-16 Kinetic Energy = 7.3106E-16 Magnetic Energy = 9.7311E-14 Total CPU time since PetscInitialize: 5.5695E+02 CPU time used for SNESSolve: 1.8279E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 38 Time (tau_A) = 5.0000E-01 0 SNES Function norm 8.088378896911e-08 1 SNES Function norm 1.124615100310e-13 2 SNES Function norm 5.544771436497e-16 Kinetic Energy = 7.3164E-16 Magnetic Energy = 1.0105E-13 Total CPU time since PetscInitialize: 5.7506E+02 CPU time used for SNESSolve: 4.0146E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 39 Time (tau_A) = 5.1333E-01 0 SNES Function norm 7.909554806664e-08 1 SNES Function norm 1.117368923673e-13 2 SNES Function norm 5.552869590808e-16 Kinetic Energy = 7.3268E-16 Magnetic Energy = 1.0485E-13 Total CPU time since PetscInitialize: 5.9949E+02 CPU time used for SNESSolve: 1.7531E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 40 Time (tau_A) = 5.2667E-01 0 SNES Function norm 7.653064797874e-08 1 SNES Function norm 1.134332671074e-13 2 SNES Function norm 5.551034395178e-16 Kinetic Energy = 7.3414E-16 Magnetic Energy = 1.0872E-13 Total CPU time since PetscInitialize: 6.2212E+02 CPU time used for SNESSolve: 1.4068E+01 Number of linear iterations : 12 Number of function evaluations : 17 Exiting the main MHD Loop Deallocating remaining arrays Destroying remaining Petsc elements ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out.local on a arch-linux2-cxx-opt named helios1497 with 256 processors, by tnicolas Tue Aug 25 14:55:05 2015 Using Petsc Release Version 3.6.0, Jun, 09, 2015 Max Max/Min Avg Total Time (sec): 6.518e+02 1.00002 6.518e+02 Objects: 1.666e+03 1.00000 1.666e+03 Flops: 4.582e+09 1.00000 4.582e+09 1.173e+12 Flops/sec: 7.029e+06 1.00002 7.029e+06 1.799e+09 MPI Messages: 8.081e+03 1.49454 6.746e+03 1.727e+06 MPI Message Lengths: 7.891e+08 3.82180 3.905e+04 6.744e+10 MPI Reductions: 3.368e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.5182e+02 100.0% 1.1729e+12 100.0% 1.727e+06 100.0% 3.905e+04 100.0% 3.367e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 SNESJacobianEval 79 1.0 2.8245e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 VecView 2 1.0 3.2757e+01 6.9 0.00e+00 0.0 1.6e+04 1.9e+05 9.0e+00 3 0 1 5 0 3 0 1 5 0 0 VecDot 79 1.0 2.1557e-01 5.1 4.53e+07 1.0 0.0e+00 0.0e+00 7.9e+01 0 1 0 0 2 0 1 0 0 2 53797 VecMDot 468 1.0 2.1886e+00 2.4 9.31e+08 1.0 0.0e+00 0.0e+00 4.7e+02 0 20 0 0 14 0 20 0 0 14 108862 VecNorm 781 1.0 6.5641e-01 1.3 4.48e+08 1.0 0.0e+00 0.0e+00 7.8e+02 0 10 0 0 23 0 10 0 0 23 174663 VecScale 1202 1.0 5.7695e-01 2.1 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 152921 VecCopy 1092 1.0 1.3964e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 827 1.0 1.2017e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1661 1.0 2.1384e+00 1.3 9.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 21 0 0 0 0 21 0 0 0 114027 VecWAXPY 1365 1.0 2.9492e+00 1.1 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 0 12 0 0 0 47586 VecMAXPY 547 1.0 1.7110e+00 1.0 1.20e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 26 0 0 0 0 26 0 0 0 179407 VecAssemblyBegin 5 1.0 2.7040e-0213.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 5 1.0 8.6308e-0511.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 80 1.0 1.4662e-01 1.2 2.29e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 40050 VecScatterBegin 1335 1.0 2.4852e+00 1.2 0.00e+00 0.0 1.7e+06 3.8e+04 0.0e+00 0 0 99 97 0 0 0 99 97 0 0 VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 VecReduceArith 158 1.0 7.9838e-02 1.1 9.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 290521 VecReduceComm 79 1.0 1.2613e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.9e+01 0 0 0 0 2 0 0 0 0 2 0 VecNormalize 547 1.0 4.6664e-01 1.2 4.25e+08 1.0 0.0e+00 0.0e+00 4.7e+02 0 9 0 0 14 0 9 0 0 14 233267 MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 MatAssemblyBegin 79 1.0 1.8835e-0519.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 79 1.0 2.2821e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 468 1.0 3.4735e+00 1.6 1.86e+09 1.0 0.0e+00 0.0e+00 4.7e+02 0 41 0 0 14 0 41 0 0 14 137185 KSPSetUp 79 1.0 7.9496e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 79 1.0 1.4808e+01 1.0 3.69e+09 1.0 1.2e+06 3.8e+04 1.9e+03 2 81 69 67 58 2 81 69 67 58 
63857 PCSetUp 79 1.0 2.1935e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 547 1.0 6.8284e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage SNES 1 1 1332 0 SNESLineSearch 1 1 864 0 DMSNES 2 1 664 0 Vector 1630 290 654183152 0 Vector Scatter 3 1 1600 0 MatMFFD 1 1 784 0 Matrix 1 1 2304 0 Distributed Mesh 3 2 9456 0 Star Forest Bipartite Graph 6 4 3312 0 Discrete System 3 2 1696 0 Index Set 6 6 187248 0 IS L to G Mapping 2 1 948 0 Krylov Solver 1 1 18360 0 DMKSP interface 1 0 0 0 Preconditioner 1 1 880 0 Viewer 3 2 1536 0 PetscRandom 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 Average time for MPI_Barrier(): 9.39369e-06 Average time for zero size MPI_Send(): 0.000582926 #PETSc Option Table entries: -dt 0.01 -log_summary -nts 40 -skip 500 -snes_mf -snes_monitor -time_limit 35950 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --prefix=/csc/softs/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real --with-clanguage=cxx --with-debugging=0 --with-x=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-fortran --known-mpi-shared-libraries=1 --with-scalar-type=real --with-precision=double --CFLAGS="-g -O3 -mavx -mkl" --CXXFLAGS="-g -O3 -mavx -mkl" --FFLAGS="-g -O3 -mavx -mkl" ----------------------------------------- Libraries compiled on Mon Jun 22 11:05:39 2015 on helios85 Machine characteristics: Linux-2.6.32-504.16.2.el6.Bull.74.x86_64-x86_64-with-redhat-6.4-Santiago Using PETSc directory: /csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0 Using PETSc arch: arch-linux2-cxx-opt ----------------------------------------- Using C compiler: mpicxx -g -O3 -mavx -mkl -fPIC ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -g -O3 -mavx -mkl -fPIC ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/opt/mpi/bullxmpi/1.2.8.2/include ----------------------------------------- Using C linker: mpicxx Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -L/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -lpetsc -lhwloc -lxml2 -lssl -lcrypto -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 
-L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_f90 -lmpi_f77 -lm -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -lmpi -lnuma -lrt -lnsl -lutil -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -limf -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -ldl ----------------------------------------- From bsmith at mcs.anl.gov Tue Aug 25 01:47:37 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 01:47:37 -0500 Subject: [petsc-users] Function evaluation slowness ? 
In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> Message-ID: <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> The results are kind of funky, ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). Barry > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. > > So this ~10 times increase does not look normal right ? > > Best > > Timothee NICOLAS > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > Hi, > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. 
> > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > So I have some questions about this. > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > { > *nfuncs = snes->nfuncs; > } > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > { > ... > snes->nfuncs++; > } > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > { > ..... > if (snes->pc && snes->pcside == PC_LEFT) { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > } else { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > } > } > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > Barry > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. > > > > Best regards > > > > Timothee NICOLAS > > > From timothee.nicolas at gmail.com Tue Aug 25 02:06:54 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 16:06:54 +0900 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> Message-ID: OK, I see, Might it be that I do something a bit funky to obtain a good guess for solve ? I had he following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the last three solutions Xkm1 and Xkm2. 
I define dxm2 = Xkm1 - Xkm2 dxm1 = Xk - Xkm1 And I use the result of the last SNESSolve as dx = Xkp1 - Xk Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) However it seems pretty local and I don't see why scatters would be required for this. I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. Timothee PetscErrorCode :: ierr PetscScalar :: M(3,3) Vec :: xkm2,xkm1 Vec :: coef1,coef2,coef3 PetscScalar :: a,b,c,t,det a = user%tkm1 b = user%tk c = user%t t = user%t+user%dt det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) M(1,1) = (b-c)/det M(2,1) = (c**2-b**2)/det M(3,1) = (c*b**2-b*c**2)/det M(1,2) = (c-a)/det M(2,2) = (a**2-c**2)/det M(3,2) = (a*c**2-c*a**2)/det M(1,3) = (a-b)/det M(2,3) = (b**2-a**2)/det M(3,3) = (b*a**2-a*b**2)/det call VecDuplicate(x,xkm1,ierr) call VecDuplicate(x,xkm2,ierr) call VecDuplicate(x,coef1,ierr) call VecDuplicate(x,coef2,ierr) call VecDuplicate(x,coef3,ierr) call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) ! The following lines correspond to the following simple operation ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma ! coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma call VecCopy(xkm2,coef1,ierr) call VecScale(coef1,M(1,1),ierr) call VecAXPY(coef1,M(1,2),xkm1,ierr) call VecAXPY(coef1,M(1,3),x,ierr) call VecCopy(xkm2,coef2,ierr) call VecScale(coef2,M(2,1),ierr) call VecAXPY(coef2,M(2,2),xkm1,ierr) call VecAXPY(coef2,M(2,3),x,ierr) call VecCopy(xkm2,coef3,ierr) call VecScale(coef3,M(3,1),ierr) call VecAXPY(coef3,M(3,2),xkm1,ierr) call VecAXPY(coef3,M(3,3),x,ierr) call VecCopy(coef3,x,ierr) call VecAXPY(x,t,coef2,ierr) call VecAXPY(x,t**2,coef1,ierr) call VecDestroy(xkm2,ierr) call VecDestroy(xkm1,ierr) call VecDestroy(coef1,ierr) call VecDestroy(coef2,ierr) call VecDestroy(coef3,ierr) 2015-08-25 15:47 GMT+09:00 Barry Smith : > > The results are kind of funky, > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 > 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 > 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 > 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > look at the %T time for global SNES solve is 46 % of the total time, > function evaluations are 45% but MatMult are only 2% (and yet matmult > should contain most of the function evaluations). I cannot explain this. 
> Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are > there so many more scatters than function evaluations? What other > operations are you doing that require scatters? > > It's almost like you have some mysterious "extra" function calls outside > of the SNESSolve that are killing the performance? It might help to > understand the performance to strip out all extraneous computations not > needed (like in custom monitors etc). > > Barry > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > Here is the log summary (attached). At the beginning are personal > prints, you can skip. I seem to have a memory crash in the present state > after typically 45 iterations (that's why I used 40 here), the log summary > indicates some creations without destruction of Petsc objects (I will fix > this immediately), that may cause the memory crash, but I don't think it's > the cause of the slow function evaluations. > > > > The log_summary is consistent with 0.7s per function evaluation > (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately > the same amount of time (is it normal ?). And the other long operation is > VecScatterEnd. I assume it is the time used in process communications ? In > which case I suppose it is normal that it takes a significant amount of > time. > > > > So this ~10 times increase does not look normal right ? > > > > Best > > > > Timothee NICOLAS > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > Hi, > > > > > > I am testing PETSc on the supercomputer where I used to run my > explicit MHD code. For my tests I use 256 processes on a problem of size > 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 > degrees of freedom (or physical fields). The explicit code was using > Runge-Kutta 4 for the time scheme, which means 4 function evaluation per > time step (plus one operation to put everything together, but let's forget > this one). > > > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > > > This means a time per function evaluation of about 0.5 to 1 second, > that is, 10 to 20 times slower. > > > > > > So I have some questions about this. > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the > function evaluations required to evaluate the Jacobian when -snes_mf is > used, as well as the operations required by the GMRES (Krylov) method ? If > it were the case, I would somehow intuitively expect a number larger than > 17, which could explain the increase in time. > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > { > > *nfuncs = snes->nfuncs; > > } > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > { > > ... > > snes->nfuncs++; > > } > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > { > > ..... 
> > if (snes->pc && snes->pcside == PC_LEFT) { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > } else { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > } > > } > > > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > > > What does your -log_summary output look like? One thing that GMRES > does is it introduces a global reduction with each multiple (hence a > barrier across all your processes) on some systems this can be deadly. > > > > Barry > > > > > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > I realize of course that preconditioning should make all this > smoother, in particular allowing larger time steps, but here I am just > concerned about the sheer Function evaluation time. > > > > > > Best regards > > > > > > Timothee NICOLAS > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 25 09:06:25 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 10:06:25 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> Message-ID: <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Regarding the MUMPS issue, I?m not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: ... Structural symmetry (in percent)= 75 Density: NBdense, Average, Median = 2 9 7 Ordering based on METIS -gideon > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: >> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: >> >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> 2. When running with SuperLU dist, I got the following error, with no further information: > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. 
We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > Barry > > >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: likely location of problem given in stack below >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> [3]PETSC ERROR: is given. >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [3]PETSC ERROR: Signal received >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> [6]PETSC ERROR: ------------------------------------------------------------------------ >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [6]PETSC ERROR: likely location of problem given in stack below >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [6]PETSC ERROR: INSTEAD the line number of the start of the function >> [6]PETSC ERROR: is given. >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [6]PETSC ERROR: Signal received >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [7]PETSC ERROR: ------------------------------------------------------------------------ >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [7]PETSC ERROR: likely location of problem given in stack below >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [7]PETSC ERROR: INSTEAD the line number of the start of the function >> [7]PETSC ERROR: is given. 
>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [7]PETSC ERROR: Signal received >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Signal received >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [2]PETSC ERROR: ------------------------------------------------------------------------ >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [2]PETSC ERROR: likely location of problem given in stack below >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [2]PETSC ERROR: INSTEAD the line number of the start of the function >> [2]PETSC ERROR: is given. >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [2]PETSC ERROR: Signal received >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [4]PETSC ERROR: ------------------------------------------------------------------------ >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [4]PETSC ERROR: likely location of problem given in stack below >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [4]PETSC ERROR: INSTEAD the line number of the start of the function >> [4]PETSC ERROR: is given. >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [4]PETSC ERROR: Signal received >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [5]PETSC ERROR: ------------------------------------------------------------------------ >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [5]PETSC ERROR: likely location of problem given in stack below >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [5]PETSC ERROR: INSTEAD the line number of the start of the function >> [5]PETSC ERROR: is given. >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [5]PETSC ERROR: Signal received >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -gideon >> >> > From zonexo at gmail.com Tue Aug 25 10:19:13 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 23:19:13 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Message-ID: <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> Hi, So can I use multigrid directly if using matsetvalues stencil? 
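(For reference, since a Fortran example of MatSetValuesStencil was asked about in this thread: a minimal sketch filling one row of a 7-point stencil on a 3D DMDA. All indices, values and variable names are illustrative, not taken from anyone's code.)

   Mat            :: J
   MatStencil     :: row(4,1), col(4,7)
   PetscScalar    :: v(7)
   PetscInt       :: i, j, k, c, n, ione, iseven
   PetscInt       :: di(6), dj(6), dk(6)
   PetscErrorCode :: ierr

   di = (/ -1, 1,  0, 0,  0, 0 /)
   dj = (/  0, 0, -1, 1,  0, 0 /)
   dk = (/  0, 0,  0, 0, -1, 1 /)
   ione   = 1
   iseven = 7

   ! i, j, k, c are assumed to come from the usual DMDAGetCorners loop,
   ! i.e. the DMDA's global (0-based) grid numbering
   row(MatStencil_i,1) = i
   row(MatStencil_j,1) = j
   row(MatStencil_k,1) = k
   row(MatStencil_c,1) = c

   ! first column is the diagonal, then the six neighbours
   col(:,1) = row(:,1)
   v(1)     = 6.0
   do n = 1, 6
      col(MatStencil_i,1+n) = i + di(n)
      col(MatStencil_j,1+n) = j + dj(n)
      col(MatStencil_k,1+n) = k + dk(n)
      col(MatStencil_c,1+n) = c
      v(1+n) = -1.0
   end do

   call MatSetValuesStencil(J,ione,row,iseven,col,v,INSERT_VALUES,ierr)

Because the locations are given as (i,j,k,c) grid indices rather than global matrix rows, the same assembly code works on whatever DMDA grid is handed to it, which is what makes this route convenient when multigrid creates coarser levels.
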
Thanks Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Tue, Aug 25, 2015 at 10:11 AM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. So MatSetValuesStencil() takes global vertex numbers. Do you mean the natural or petsc ordering? There is no PETSc ordering for vertices, only the natural ordering. Thanks, Matt Which is a simpler/better option? MatSetValuesStencil() Is there an example in Fortran for MatSetValuesStencil? Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? No. Thanks, Matt Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 25 10:31:09 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Aug 2015 10:31:09 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> Message-ID: On Tue, Aug 25, 2015 at 10:19 AM, Wee Beng Tay wrote: > Hi, > > So can I use multigrid directly if using matsetvalues stencil? > Do you mean, if you use MatSetStencil() then you statements will work no matter what grid comes in to your residual function? That is true. Thanks, Matt > Thanks > > > Sent using CloudMagic > > On Tue, Aug 25, 2015 at 10:11 AM, Matthew Knepley > wrote: > > On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay wrote: > >> >> >> Sent using CloudMagic >> >> On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley >> wrote: >> >> On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: >> >>> Hi, >>> >>> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >>> along 2 directions (y,z) >>> >>> Previously I was using MatSetValues with global indices. However, now >>> I'm using DM and global indices is much more difficult. >>> >>> I come across MatSetValuesStencil or MatSetValuesLocal. >>> >>> So what's the difference bet the one since they both seem to work >>> locally? >>> >> >> No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes >> global vertex numbers. >> >> >> So MatSetValuesStencil() takes global vertex numbers. 
Do you mean the >> natural or petsc ordering? >> > > There is no PETSc ordering for vertices, only the natural ordering. > > Thanks, > > Matt > >> Which is a simpler/better option? >>> >> >> MatSetValuesStencil() >> >>> Is there an example in Fortran for MatSetValuesStencil? >>> >> >> Timoth?e Nicolas shows one in his reply. >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >>> MatSetValuesLocal? >>> >> >> No. >> >> Thanks, >> >> Matt >> >>> Thanks! >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Aug 25 11:24:41 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 11:24:41 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Message-ID: Gideon: -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. Hong On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson wrote: > Regarding the MUMPS issue, I?m not sure if this is useful, but when I run > with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs > at this point: > > > ... Structural symmetry (in percent)= 75 > Density: NBdense, Average, Median = 2 9 7 > Ordering based on METIS > > -gideon > > > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > > > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: > >> > >> I?m having issues with both SuperLU dist and MUMPS, as compiled by > PETsc, in the following sense: > >> > >> 1. For large enough systems, which seems to vary depending on which > computer I?m on, MUMPS seems to just die and never start, when it?s used as > the linear solver within a SNES. There?s no error message, it just sits > there and doesn?t do anything. > > > > You will need to use a debugger to figure out where it is "hanging"; we > haven't heard reports about this. > >> > >> 2. When running with SuperLU dist, I got the following error, with no > further information: > > > > The last release of SuperLU_DIST had some pretty nasty bugs, memory > corruption that caused crashes etc. 
We think they are now fixed if you use > the maint branch of the PETSc repository and --download-superlu_dist If > you stick with the PETSc release and SuperLU_Dist you are using you will > keep seeing these crashes > > > > Barry > > > > > >> > >> [3]PETSC ERROR: > ------------------------------------------------------------------------ > >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > >> [3]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [3]PETSC ERROR: likely location of problem given in stack below > >> [3]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [3]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [3]PETSC ERROR: is given. > >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [3]PETSC ERROR: [3] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [3]PETSC ERROR: Signal received > >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [3]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > -------------------------------------------------------------------------- > >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > >> with errorcode 59. > >> > >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > >> You may or may not see output from other processes, depending on > >> exactly when Open MPI kills them. 
> >> > -------------------------------------------------------------------------- > >> [proteusi01:14037] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to > see all help / error messages > >> [6]PETSC ERROR: > ------------------------------------------------------------------------ > >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [6]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [6]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [6]PETSC ERROR: likely location of problem given in stack below > >> [6]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [6]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [6]PETSC ERROR: is given. > >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [6]PETSC ERROR: [6] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [6]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [6]PETSC ERROR: Signal received > >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [6]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [7]PETSC ERROR: > ------------------------------------------------------------------------ > >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [7]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [7]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [7]PETSC ERROR: likely location of problem given in stack below > >> [7]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [7]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [7]PETSC ERROR: is given. 
> >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [7]PETSC ERROR: [7] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [7]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [7]PETSC ERROR: Signal received > >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [7]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [0]PETSC ERROR: > ------------------------------------------------------------------------ > >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [0]PETSC ERROR: likely location of problem given in stack below > >> [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [0]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [0]PETSC ERROR: is given. > >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [0]PETSC ERROR: [0] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [0]PETSC ERROR: Signal received > >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [0]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [1]PETSC ERROR: > ------------------------------------------------------------------------ > >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [1]PETSC ERROR: likely location of problem given in stack below > >> [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [1]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [1]PETSC ERROR: is given. > >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [1]PETSC ERROR: [1] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [1]PETSC ERROR: Signal received > >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [1]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [2]PETSC ERROR: > ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [2]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [2]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [2]PETSC ERROR: likely location of problem given in stack below > >> [2]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [2]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [2]PETSC ERROR: is given. > >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [2]PETSC ERROR: [2] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [2]PETSC ERROR: Signal received > >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [2]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [4]PETSC ERROR: > ------------------------------------------------------------------------ > >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [4]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [4]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [4]PETSC ERROR: likely location of problem given in stack below > >> [4]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [4]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [4]PETSC ERROR: is given. > >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [4]PETSC ERROR: [4] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [4]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [4]PETSC ERROR: Signal received > >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [4]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [5]PETSC ERROR: > ------------------------------------------------------------------------ > >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [5]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [5]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [5]PETSC ERROR: likely location of problem given in stack below > >> [5]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [5]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [5]PETSC ERROR: is given. > >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [5]PETSC ERROR: [5] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [5]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [5]PETSC ERROR: Signal received > >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [5]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > >> -gideon > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 25 12:06:53 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 13:06:53 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Message-ID: <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Hi Hong, I ran with that flag because, while solving a SNES with MUMPS, the code would just sit there as though it had died, and never seem to recover. I tried using that flag just to determine where it had stalled, which was at the "ordering based on METIS? bit. -gideon > On Aug 25, 2015, at 12:24 PM, Hong wrote: > > Gideon: > -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) > This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. > > Hong > > On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: > Regarding the MUMPS issue, I?m not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: > > > ... Structural symmetry (in percent)= 75 > Density: NBdense, Average, Median = 2 9 7 > Ordering based on METIS > > -gideon > > > On Aug 22, 2015, at 5:12 PM, Barry Smith > wrote: > > > > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: > >> > >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: > >> > >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. > >> > >> 2. When running with SuperLU dist, I got the following error, with no further information: > > > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > > > Barry > > > > > >> > >> [3]PETSC ERROR: ------------------------------------------------------------------------ > >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [3]PETSC ERROR: likely location of problem given in stack below > >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [3]PETSC ERROR: INSTEAD the line number of the start of the function > >> [3]PETSC ERROR: is given. 
> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [3]PETSC ERROR: Signal received > >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> -------------------------------------------------------------------------- > >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > >> with errorcode 59. > >> > >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > >> You may or may not see output from other processes, depending on > >> exactly when Open MPI kills them. > >> -------------------------------------------------------------------------- > >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort > >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > >> [6]PETSC ERROR: ------------------------------------------------------------------------ > >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [6]PETSC ERROR: likely location of problem given in stack below > >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [6]PETSC ERROR: INSTEAD the line number of the start of the function > >> [6]PETSC ERROR: is given. 
> >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [6]PETSC ERROR: Signal received > >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [7]PETSC ERROR: ------------------------------------------------------------------------ > >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [7]PETSC ERROR: likely location of problem given in stack below > >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [7]PETSC ERROR: INSTEAD the line number of the start of the function > >> [7]PETSC ERROR: is given. > >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [7]PETSC ERROR: Signal received > >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [0]PETSC ERROR: ------------------------------------------------------------------------ > >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [0]PETSC ERROR: likely location of problem given in stack below > >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [0]PETSC ERROR: INSTEAD the line number of the start of the function > >> [0]PETSC ERROR: is given. > >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [0]PETSC ERROR: Signal received > >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [1]PETSC ERROR: ------------------------------------------------------------------------ > >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [1]PETSC ERROR: likely location of problem given in stack below > >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [1]PETSC ERROR: INSTEAD the line number of the start of the function > >> [1]PETSC ERROR: is given. > >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [1]PETSC ERROR: Signal received > >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [2]PETSC ERROR: ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [2]PETSC ERROR: likely location of problem given in stack below > >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [2]PETSC ERROR: INSTEAD the line number of the start of the function > >> [2]PETSC ERROR: is given. > >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [2]PETSC ERROR: Signal received > >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [4]PETSC ERROR: ------------------------------------------------------------------------ > >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [4]PETSC ERROR: likely location of problem given in stack below > >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [4]PETSC ERROR: INSTEAD the line number of the start of the function > >> [4]PETSC ERROR: is given. > >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [4]PETSC ERROR: Signal received > >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [5]PETSC ERROR: ------------------------------------------------------------------------ > >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [5]PETSC ERROR: likely location of problem given in stack below > >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [5]PETSC ERROR: INSTEAD the line number of the start of the function > >> [5]PETSC ERROR: is given. > >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [5]PETSC ERROR: Signal received > >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > >> -gideon > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 12:39:16 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 12:39:16 -0500 Subject: [petsc-users] Function evaluation slowness ? 
In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> Message-ID: <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> > On Aug 25, 2015, at 2:06 AM, Timothée Nicolas wrote: > > OK, I see, > > Might it be that I do something a bit funky to obtain a good guess for solve ? I had the following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the two previous solutions Xkm1 and Xkm2. > > I define > > dxm2 = Xkm1 - Xkm2 > dxm1 = Xk - Xkm1 > > And I use the result of the last SNESSolve as > > dx = Xkp1 - Xk > > Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) > > However it seems pretty local and I don't see why scatters would be required for this. Yes, no scatters here. > > I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. > > Timothee > > PetscErrorCode :: ierr > > PetscScalar :: M(3,3) > Vec :: xkm2,xkm1 > Vec :: coef1,coef2,coef3 > PetscScalar :: a,b,c,t,det > > a = user%tkm1 > b = user%tk > c = user%t > t = user%t+user%dt > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > M(1,1) = (b-c)/det > M(2,1) = (c**2-b**2)/det > M(3,1) = (c*b**2-b*c**2)/det > > M(1,2) = (c-a)/det > M(2,2) = (a**2-c**2)/det > M(3,2) = (a*c**2-c*a**2)/det > > M(1,3) = (a-b)/det > M(2,3) = (b**2-a**2)/det > M(3,3) = (b*a**2-a*b**2)/det > > call VecDuplicate(x,xkm1,ierr) > call VecDuplicate(x,xkm2,ierr) > > call VecDuplicate(x,coef1,ierr) > call VecDuplicate(x,coef2,ierr) > call VecDuplicate(x,coef3,ierr) > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > ! The following lines correspond to the following simple operation > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > !
coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > call VecCopy(xkm2,coef1,ierr) > call VecScale(coef1,M(1,1),ierr) > call VecAXPY(coef1,M(1,2),xkm1,ierr) > call VecAXPY(coef1,M(1,3),x,ierr) > > call VecCopy(xkm2,coef2,ierr) > call VecScale(coef2,M(2,1),ierr) > call VecAXPY(coef2,M(2,2),xkm1,ierr) > call VecAXPY(coef2,M(2,3),x,ierr) > > call VecCopy(xkm2,coef3,ierr) > call VecScale(coef3,M(3,1),ierr) > call VecAXPY(coef3,M(3,2),xkm1,ierr) > call VecAXPY(coef3,M(3,3),x,ierr) > > call VecCopy(coef3,x,ierr) > call VecAXPY(x,t,coef2,ierr) > call VecAXPY(x,t**2,coef1,ierr) > > call VecDestroy(xkm2,ierr) > call VecDestroy(xkm1,ierr) > > call VecDestroy(coef1,ierr) > call VecDestroy(coef2,ierr) > call VecDestroy(coef3,ierr) > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > The results are kind of funky, > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? > > It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). > > Barry > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. > > > > So this ~10 times increase does not look normal right ? 
> > > > Best > > > > Timothee NICOLAS > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > > > Hi, > > > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > > > So I have some questions about this. > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > { > > *nfuncs = snes->nfuncs; > > } > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > { > > ... > > snes->nfuncs++; > > } > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > { > > ..... > > if (snes->pc && snes->pcside == PC_LEFT) { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > } else { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > } > > } > > > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > > > Barry > > > > > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. 
> > > > > > Best regards > > > Timothee NICOLAS > > > > > > From bsmith at mcs.anl.gov Tue Aug 25 14:14:56 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 14:14:56 -0500 Subject: [petsc-users] on the data size problem In-Reply-To: References: Message-ID: <7CF7408A-9540-4ABA-BFAF-E16B5085A0B4@mcs.anl.gov> Convergence of iterative schemes depends on problem sizes and problem properties. You need to debug your code/algorithm to determine what is going on. Some advice: NEVER NEVER NEVER run in parallel until you are getting correct behavior and solutions on one process consistently. Increase the problem size slightly from 5 until you start seeing bad behavior; don't just jump from 5 to 5000. Barry > On Aug 19, 2015, at 10:51 AM, Hongliang Lu wrote: > > Dear all, > I am trying to implement a BFS algorithm using PETSc, and I have tested my code on a graph of 5 nodes, but when I tested on a larger graph, whose size is 5000 nodes, the program went wrong and could not finish; could someone help me out? Thank you very much!!!!! > I tried to run the following code in a cluster with 10 nodes. > > int main(int argc,char **args) > { > Vec curNodes,tmp; > Mat oriGraph; > PetscInt rows, cols; > PetscScalar one=1; > PetscScalar nodeVecSum=1; > char filein[PETSC_MAX_PATH_LEN],fileout[PETSC_MAX_PATH_LEN],buf[PETSC_MAX_PATH_LEN]; > PetscViewer fd; > PetscInitialize(&argc,&args,(char *)0,help); > > PetscOptionsGetString(PETSC_NULL,"-fin",filein,PETSC_MAX_PATH_LEN-1,PETSC_NULL); > PetscViewerBinaryOpen(PETSC_COMM_WORLD,filein,FILE_MODE_READ,&fd); > MatCreate(PETSC_COMM_WORLD,&oriGraph); > > MatLoad(oriGraph,fd); > MatGetSize(oriGraph,&rows,&cols); > MatSetOption(oriGraph,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > MatSetUp(oriGraph); > VecCreate(PETSC_COMM_WORLD,&curNodes); > > VecSetSizes(curNodes,PETSC_DECIDE,rows); > VecSetFromOptions(curNodes); > VecCreate(PETSC_COMM_WORLD,&tmp); > VecSetSizes(tmp,PETSC_DECIDE,rows); > VecSetFromOptions(tmp); > VecZeroEntries(tmp); > srand(time(0)); > PetscInt node=rand()%rows; > PetscPrintf(PETSC_COMM_SELF,"The node ID is: %d \n",node); > VecSetValues(curNodes,1,&node,&one,INSERT_VALUES); > VecAssemblyBegin(curNodes); > VecAssemblyEnd(curNodes); > > PetscViewerDestroy(&fd); > > const PetscInt *colsv; > const PetscScalar *valsv; > PetscInt ncols,i,zero=0; > PetscInt iter=0; > > nodeVecSum=1; > for(;iter<10;iter++) > { > VecAssemblyBegin(curNodes); > VecAssemblyEnd(curNodes); > MatMult(oriGraph,curNodes,tmp); > VecAssemblyBegin(tmp); > VecAssemblyEnd(tmp); > VecSum(tmp,&nodeVecSum); > PetscPrintf(PETSC_COMM_SELF,"There are neighbors: %d \n",(int)nodeVecSum); > VecSum(curNodes,&nodeVecSum); > if(nodeVecSum<1) > break; > > PetscScalar y; > PetscInt indices; > PetscInt n,m,rstart,rend; > IS isrow; > Mat curMat; > MatGetLocalSize(oriGraph,&n,&m); > MatGetOwnershipRange(oriGraph,&rstart,&rend); > ISCreateStride(PETSC_COMM_SELF,n,rstart,1,&isrow); > MatGetSubMatrix(oriGraph,isrow,NULL,MAT_INITIAL_MATRIX,&curMat); > > MatGetSize(curMat,&n,&m); > for(i=rstart;i<rend;i++) { > indices=i; > VecGetValues(curNodes,1,&indices,&y); > if(y>0){ > MatGetRow(oriGraph,indices,&ncols,&colsv,&valsv); > PetscScalar *v,zero=0; > PetscMalloc1(cols,&v); > for(int j=0;j<ncols;j++){ > v[j]=zero; > } > MatSetValues(oriGraph,1,&indices,ncols,colsv,v,INSERT_VALUES); > PetscFree(v); > > } > > } > MatAssemblyBegin(oriGraph,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(oriGraph,MAT_FINAL_ASSEMBLY); > ISDestroy(&isrow); > > MatDestroy(&curMat); > > VecCopy(tmp,curNodes); > VecAssemblyBegin(curNodes); >
VecAssemblyEnd(curNodes); > > } > PetscPrintf(PETSC_COMM_SELF,"Finished in iterations of: %d\n",iter); > MatDestroy(&oriGraph); > VecDestroy(&curNodes); > VecDestroy(&tmp); > PetscFinalize(); > return 0; > } > The PETSc version I have installed is 3.6.1. > > From hzhang at mcs.anl.gov Tue Aug 25 15:35:29 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 15:35:29 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: Gideon : > > > I ran with that flag because, while solving a SNES with MUMPS, the code > would just sit there as though it had died, and never seem to recover. I > tried using that flag just to determine where it had stalled, which was at > the "ordering based on METIS" bit. > If you suspect METIS/ParMetis hangs, then turn to other sequential matrix orderings, e.g., ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most robust ordering. Run your code with '-help |grep mumps', it will display mumps options. Hong > > -gideon > > On Aug 25, 2015, at 12:24 PM, Hong wrote: > > Gideon: > -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) > This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. > > Hong > > On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: >> Regarding the MUMPS issue, I'm not sure if this is useful, but when I run >> with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs >> at this point: >> >> >> ... Structural symmetry (in percent)= 75 >> Density: NBdense, Average, Median = 2 9 7 >> Ordering based on METIS >> >> -gideon >> >> > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: >> > >> > >> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson >> wrote: >> >> >> >> I'm having issues with both SuperLU dist and MUMPS, as compiled by >> PETSc, in the following sense: >> >> >> >> 1. For large enough systems, which seems to vary depending on which >> computer I'm on, MUMPS seems to just die and never start, when it's used as >> the linear solver within a SNES. There's no error message, it just sits >> there and doesn't do anything. >> > >> > You will need to use a debugger to figure out where it is "hanging"; >> we haven't heard reports about this. >> >> >> >> 2. When running with SuperLU dist, I got the following error, with no >> further information: >> > >> > The last release of SuperLU_DIST had some pretty nasty bugs, memory >> corruption that caused crashes etc.
We think they are now fixed if you use >> the maint branch of the PETSc repository and --download-superlu_dist If >> you stick with the PETSc release and SuperLU_Dist you are using you will >> keep seeing these crashes >> > >> > Barry >> > >> > >> >> >> >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> [3]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [3]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [3]PETSC ERROR: likely location of problem given in stack below >> >> [3]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [3]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [3]PETSC ERROR: is given. >> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [3]PETSC ERROR: [3] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [3]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [3]PETSC ERROR: Signal received >> >> [3]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [3]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> >> -------------------------------------------------------------------------- >> >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> >> with errorcode 59. >> >> >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> >> You may or may not see output from other processes, depending on >> >> exactly when Open MPI kills them. 
>> >> >> -------------------------------------------------------------------------- >> >> [proteusi01:14037] 1 more process has sent help message >> help-mpi-api.txt / mpi-abort >> >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 >> to see all help / error messages >> >> [6]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [6]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [6]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [6]PETSC ERROR: likely location of problem given in stack below >> >> [6]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [6]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [6]PETSC ERROR: is given. >> >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [6]PETSC ERROR: [6] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [6]PETSC ERROR: [6] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [6]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [6]PETSC ERROR: Signal received >> >> [6]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [6]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [7]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [7]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [7]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [7]PETSC ERROR: likely location of problem given in stack below >> >> [7]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [7]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [7]PETSC ERROR: is given. >> >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [7]PETSC ERROR: [7] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [7]PETSC ERROR: [7] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [7]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [7]PETSC ERROR: Signal received >> >> [7]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [7]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [0]PETSC ERROR: likely location of problem given in stack below >> >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [0]PETSC ERROR: is given. >> >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [0]PETSC ERROR: [0] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [0]PETSC ERROR: [0] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [0]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [1]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [1]PETSC ERROR: likely location of problem given in stack below >> >> [1]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [1]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [1]PETSC ERROR: is given. >> >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [1]PETSC ERROR: [1] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [1]PETSC ERROR: [1] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [1]PETSC ERROR: Signal received >> >> [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [1]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [2]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [2]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [2]PETSC ERROR: likely location of problem given in stack below >> >> [2]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [2]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [2]PETSC ERROR: is given. >> >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [2]PETSC ERROR: [2] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [2]PETSC ERROR: [2] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [2]PETSC ERROR: Signal received >> >> [2]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [2]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [4]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [4]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [4]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [4]PETSC ERROR: likely location of problem given in stack below >> >> [4]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [4]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [4]PETSC ERROR: is given. >> >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [4]PETSC ERROR: [4] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [4]PETSC ERROR: [4] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [4]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [4]PETSC ERROR: Signal received >> >> [4]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [4]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [5]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [5]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [5]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [5]PETSC ERROR: likely location of problem given in stack below >> >> [5]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [5]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [5]PETSC ERROR: is given. >> >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [5]PETSC ERROR: [5] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [5]PETSC ERROR: [5] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [5]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [5]PETSC ERROR: Signal received >> >> [5]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [5]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> >> >> -gideon >> >> >> >> >> > >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 25 15:54:59 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 16:54:59 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: running with -mat_mumps_icntl_7 4 got it to run on problems that it couldn't do before, thanks. How should I understand how this choice of flag is impacting whether or not it stalls? -gideon > On Aug 25, 2015, at 4:35 PM, Hong wrote: > > Gideon : > > I ran with that flag because, while solving a SNES with MUMPS, the code would just sit there as though it had died, and never seem to recover. I tried using that flag just to determine where it had stalled, which was at the "ordering based on METIS" bit. > > If you suspect METIS/ParMetis hangs, > then turn to other sequential matrix orderings, e.g., > ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most robust ordering. > Run your code with '-help |grep mumps', it will display mumps options. > > Hong > > > -gideon > >> On Aug 25, 2015, at 12:24 PM, Hong > wrote: >> >> Gideon: >> -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) >> This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. >> >> Hong >> >> On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: >> Regarding the MUMPS issue, I'm not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: >> >> >> ... Structural symmetry (in percent)= 75 >> Density: NBdense, Average, Median = 2 9 7 >> Ordering based on METIS >> >> -gideon >> >> > On Aug 22, 2015, at 5:12 PM, Barry Smith > wrote: >> > >> > >> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: >> >> >> >> I'm having issues with both SuperLU dist and MUMPS, as compiled by PETSc, in the following sense: >> >> >> >> 1. For large enough systems, which seems to vary depending on which computer I'm on, MUMPS seems to just die and never start, when it's used as the linear solver within a SNES. There's no error message, it just sits there and doesn't do anything. >> > >> > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> >> >> 2. When running with SuperLU dist, I got the following error, with no further information: >> > >> > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc.
We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes >> > >> > Barry >> > >> > >> >> >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> >> [3]PETSC ERROR: likely location of problem given in stack below >> >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> >> [3]PETSC ERROR: is given. >> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >> [3]PETSC ERROR: Signal received >> >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -------------------------------------------------------------------------- >> >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> >> with errorcode 59. >> >> >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> >> You may or may not see output from other processes, depending on >> >> exactly when Open MPI kills them. 
>> >> --------------------------------------------------------------------------
>> >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort
>> >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> >> [Ranks 6, 7, 0, 1, 2, 4 and 5 then each report "Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end" followed by the same SuperLU_DIST:pdgssvx / MatSolve_SuperLU_DIST / MatSolve / PCApply_LU / KSP_PCApplyBAorAB stack trace and configure options already shown above for rank 3; the duplicated traces are omitted.]
>> >>
>> >> -gideon
>> >>
>> >>
>> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From hzhang at mcs.anl.gov Tue Aug 25 17:23:28 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 17:23:28 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: Gideon : > running with -mat_mumps_icntl_7 4 got it to run on problems that it > couldn?t do before, thanks. How should I understand how this choice of > flag is impacting whether or not it stalls? > Good to know that your code starts running :-) I've never encountered hang with mumps. It turns that the hang occurred in metis, which I rarely use. Debugging what causes hang, I usually run code with a debugger, e.g. use opiton '-start_in_debugger', when code hangs, hit control^C to see where it hangs. Hong > > On Aug 25, 2015, at 4:35 PM, Hong wrote: > > Gideon : >> >> >> I ran with that flag because, while solving a SNES with MUMPS, the code >> would just sit there as though it had died, and never seem to recover. I >> tried using that flag just to determine where it had stalled, which was at >> the "ordering based on METIS? bit. >> > > If you suspect METIS/ParMetis hangs, > then turn to other sequential matrix orderings, e.g., > ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most > robust ordering. > Run your code with '-help |grep mumps', it will display mumps options. > > Hong > > >> >> -gideon >> >> On Aug 25, 2015, at 12:24 PM, Hong wrote: >> >> Gideon: >> -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) >> This is for algorithmic diagnosis, not for regular runs. Use default '0' >> for it. >> >> Hong >> >> On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > > wrote: >> >>> Regarding the MUMPS issue, I?m not sure if this is useful, but when I >>> run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it >>> hangs at this point: >>> >>> >>> ... Structural symmetry (in percent)= 75 >>> Density: NBdense, Average, Median = 2 9 7 >>> Ordering based on METIS >>> >>> -gideon >>> >>> > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: >>> > >>> > >>> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson >>> wrote: >>> >> >>> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by >>> PETsc, in the following sense: >>> >> >>> >> 1. For large enough systems, which seems to vary depending on which >>> computer I?m on, MUMPS seems to just die and never start, when it?s used as >>> the linear solver within a SNES. There?s no error message, it just sits >>> there and doesn?t do anything. >>> > >>> > You will need to use a debugger to figure out where it is "hanging"; >>> we haven't heard reports about this. >>> >> >>> >> 2. When running with SuperLU dist, I got the following error, with >>> no further information: >>> > >>> > The last release of SuperLU_DIST had some pretty nasty bugs, memory >>> corruption that caused crashes etc. 
We think they are now fixed if you use
>>> the maint branch of the PETSc repository and --download-superlu_dist If
>>> you stick with the PETSc release and SuperLU_Dist you are using you will
>>> keep seeing these crashes
>>> >
>>> > Barry
>>> >
>>> [The rest of the quoted message repeats, verbatim, the rank 0-7 SEGV/Terminate stack traces and configure options reproduced earlier in this thread; the duplicated output is omitted.]
>>> >>
>>> >> -gideon
>>> >>
>>> >>
>>> >
>> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From timothee.nicolas at gmail.com Tue Aug 25 21:19:56 2015
From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=)
Date: Wed, 26 Aug 2015 11:19:56 +0900
Subject: [petsc-users] Function evaluation slowness ?
In-Reply-To: <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> Message-ID: Hi, Problem solved ! Several points to be noticed : 1. First the discrepancy between creations and destructions showed I had forgotten some VecDestroy, and also a VecRestoreArrayReadF90. Repairing this seemed to increase a bit the speed but did not solve the actual more serious problem seen in the log_summary. 2. The actual problem was a very stupid one on my side. At some point I print small diagnostics at every time step to a text file with standard Fortran write statement rather than a viewer to a binary file. I had simply forgotten to put the statement between an if statement on the rank if (rank.eq.0) then write(50) .... end if So all the processors were trying to write together to the file, which, I suppose, somehow caused all the Scatters. After adding the if statement, I recover a fast speed (about 10 ms per function evaluation). Thank you so much for your help, I would never have made it so far without it !!! 3. Last minor point, a discrepancy of 1 remains between creations and destructions (see below extract of log_summary) for the category "viewer". I have checked that it is also the case for the examples ex5f90.F and ex5.c on which my code is based. I can't track it down, but it's probably a minor point anyway. --- Event Stage 0: Main Stage SNES 1 1 1332 0 SNESLineSearch 1 1 864 0 DMSNES 2 2 1328 0 Vector 20 20 34973120 0 Vector Scatter 3 3 503488 0 MatMFFD 1 1 768 0 Matrix 1 1 2304 0 Distributed Mesh 3 3 14416 0 Star Forest Bipartite Graph 6 6 5024 0 Discrete System 3 3 2544 0 Index Set 6 6 187248 0 IS L to G Mapping 2 2 184524 0 Krylov Solver 1 1 1304 0 DMKSP interface 1 1 648 0 Preconditioner 1 1 880 0 Viewer 3 2 1536 0 PetscRandom 1 1 624 0 ======================================================================================================================== Best Timothee 2015-08-26 2:39 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 2:06 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > OK, I see, > > > > Might it be that I do something a bit funky to obtain a good guess for > solve ? I had he following idea, which I used with great success on a very > different problem (much simpler, maybe that's why it worked) : obtain the > initial guess as a cubic extrapolation of the preceding solutions. The idea > is that I expect my solution to be reasonably smooth over time, so > considering this, the increment of the fields should also be continuous (I > solve for the increments, not the fields themselves). Therefore, I store in > my user context the current vector Xk as well as the last three solutions > Xkm1 and Xkm2. > > > > I define > > > > dxm2 = Xkm1 - Xkm2 > > dxm1 = Xk - Xkm1 > > > > And I use the result of the last SNESSolve as > > > > dx = Xkp1 - Xk > > > > Then I set the new dx initial guess as the pointwise cubic extrapolation > of (dxm2,dxm1,dx) > > > > However it seems pretty local and I don't see why scatters would be > required for this. > > Yes, no scatters here. > > > > > I printed the routine I use to do this below. In any case I will clean > up a bit, remove the extra stuff (not much there however). If it is not > sufficient, I will transform my form function in a dummy which does not > require computations and see what happens. 
> > > > Timothee > > > > PetscErrorCode :: ierr > > > > PetscScalar :: M(3,3) > > Vec :: xkm2,xkm1 > > Vec :: coef1,coef2,coef3 > > PetscScalar :: a,b,c,t,det > > > > a = user%tkm1 > > b = user%tk > > c = user%t > > t = user%t+user%dt > > > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > > > M(1,1) = (b-c)/det > > M(2,1) = (c**2-b**2)/det > > M(3,1) = (c*b**2-b*c**2)/det > > > > M(1,2) = (c-a)/det > > M(2,2) = (a**2-c**2)/det > > M(3,2) = (a*c**2-c*a**2)/det > > > > M(1,3) = (a-b)/det > > M(2,3) = (b**2-a**2)/det > > M(3,3) = (b*a**2-a*b**2)/det > > > > call VecDuplicate(x,xkm1,ierr) > > call VecDuplicate(x,xkm2,ierr) > > > > call VecDuplicate(x,coef1,ierr) > > call VecDuplicate(x,coef2,ierr) > > call VecDuplicate(x,coef3,ierr) > > > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > > > ! The following lines correspond to the following simple operation > > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > > ! coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > > call VecCopy(xkm2,coef1,ierr) > > call VecScale(coef1,M(1,1),ierr) > > call VecAXPY(coef1,M(1,2),xkm1,ierr) > > call VecAXPY(coef1,M(1,3),x,ierr) > > > > call VecCopy(xkm2,coef2,ierr) > > call VecScale(coef2,M(2,1),ierr) > > call VecAXPY(coef2,M(2,2),xkm1,ierr) > > call VecAXPY(coef2,M(2,3),x,ierr) > > > > call VecCopy(xkm2,coef3,ierr) > > call VecScale(coef3,M(3,1),ierr) > > call VecAXPY(coef3,M(3,2),xkm1,ierr) > > call VecAXPY(coef3,M(3,3),x,ierr) > > > > call VecCopy(coef3,x,ierr) > > call VecAXPY(x,t,coef2,ierr) > > call VecAXPY(x,t**2,coef1,ierr) > > > > call VecDestroy(xkm2,ierr) > > call VecDestroy(xkm1,ierr) > > > > call VecDestroy(coef1,ierr) > > call VecDestroy(coef2,ierr) > > call VecDestroy(coef3,ierr) > > > > > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > > > The results are kind of funky, > > > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 > 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 > 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 > 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > > > look at the %T time for global SNES solve is 46 % of the total time, > function evaluations are 45% but MatMult are only 2% (and yet matmult > should contain most of the function evaluations). I cannot explain this. > Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are > there so many more scatters than function evaluations? What other > operations are you doing that require scatters? 
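[A minimal sketch, not code from this thread, of the kind of instrumentation this advice points at: wrapping user work done outside SNESSolve, here the per-time-step diagnostic write that turned out to be the culprit, in a registered PetscLogEvent so that -log_summary reports its time and load balance on its own line instead of folding it into the scatter numbers. It is written against the PETSc 3.5 API used in this thread. The names "UserDiag", "UserDiagWrite", RegisterUserDiagEvent() and WriteStepDiagnostics() are illustrative placeholders, not part of the original code; the rank-0 guard mirrors the fix described at the top of this message.]

#include <petscsys.h>

static PetscClassId  USER_DIAG_CLASSID;
static PetscLogEvent USER_DIAG_EVENT;

/* Call once after PetscInitialize(): register an event that -log_summary
   will report on its own line, so time spent in user diagnostics is not
   silently attributed to SNESFunctionEval or VecScatter. */
PetscErrorCode RegisterUserDiagEvent(void)
{
  PetscErrorCode ierr;
  ierr = PetscClassIdRegister("UserDiag",&USER_DIAG_CLASSID);CHKERRQ(ierr);
  ierr = PetscLogEventRegister("UserDiagWrite",USER_DIAG_CLASSID,&USER_DIAG_EVENT);CHKERRQ(ierr);
  return 0;
}

/* Per-time-step diagnostics: only rank 0 writes, every other rank falls
   straight through, so no hidden synchronization is added here. */
PetscErrorCode WriteStepDiagnostics(MPI_Comm comm,PetscInt step,PetscReal t)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscLogEventBegin(USER_DIAG_EVENT,0,0,0,0);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm,&rank);CHKERRQ(ierr);
  if (!rank) {   /* only rank 0 touches the output */
    ierr = PetscPrintf(PETSC_COMM_SELF,"step %D  t = %g\n",step,(double)t);CHKERRQ(ierr);
  }
  ierr = PetscLogEventEnd(USER_DIAG_EVENT,0,0,0,0);CHKERRQ(ierr);
  return 0;
}

[Called once per time step, the "UserDiagWrite" line in -log_summary should then show directly how much wall time the diagnostics cost and how unevenly it is distributed across ranks, which would have exposed an unguarded all-ranks write immediately.]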
> > > > It's almost like you have some mysterious "extra" function calls outside > of the SNESSolve that are killing the performance? It might help to > understand the performance to strip out all extraneous computations not > needed (like in custom monitors etc). > > > > Barry > > > > > > > > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > Here is the log summary (attached). At the beginning are personal > prints, you can skip. I seem to have a memory crash in the present state > after typically 45 iterations (that's why I used 40 here), the log summary > indicates some creations without destruction of Petsc objects (I will fix > this immediately), that may cause the memory crash, but I don't think it's > the cause of the slow function evaluations. > > > > > > The log_summary is consistent with 0.7s per function evaluation > (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately > the same amount of time (is it normal ?). And the other long operation is > VecScatterEnd. I assume it is the time used in process communications ? In > which case I suppose it is normal that it takes a significant amount of > time. > > > > > > So this ~10 times increase does not look normal right ? > > > > > > Best > > > > > > Timothee NICOLAS > > > > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > > > Hi, > > > > > > > > I am testing PETSc on the supercomputer where I used to run my > explicit MHD code. For my tests I use 256 processes on a problem of size > 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 > degrees of freedom (or physical fields). The explicit code was using > Runge-Kutta 4 for the time scheme, which means 4 function evaluation per > time step (plus one operation to put everything together, but let's forget > this one). > > > > > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > > > > > This means a time per function evaluation of about 0.5 to 1 second, > that is, 10 to 20 times slower. > > > > > > > > So I have some questions about this. > > > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the > function evaluations required to evaluate the Jacobian when -snes_mf is > used, as well as the operations required by the GMRES (Krylov) method ? If > it were the case, I would somehow intuitively expect a number larger than > 17, which could explain the increase in time. > > > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > > { > > > *nfuncs = snes->nfuncs; > > > } > > > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > > { > > > ... > > > snes->nfuncs++; > > > } > > > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > > { > > > ..... 
> > > if (snes->pc && snes->pcside == PC_LEFT) { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > > } else { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > > } > > > } > > > > > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > > > > > What does your -log_summary output look like? One thing that GMRES > does is it introduces a global reduction with each multiple (hence a > barrier across all your processes) on some systems this can be deadly. > > > > > > Barry > > > > > > > > > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > > > I realize of course that preconditioning should make all this > smoother, in particular allowing larger time steps, but here I am just > concerned about the sheer Function evaluation time. > > > > > > > > Best regards > > > > > > > > Timothee NICOLAS > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 21:25:14 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 21:25:14 -0500 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> Message-ID: <9A6ECF0C-15C2-4993-8F48-8B7D7441BD3C@mcs.anl.gov> > On Aug 25, 2015, at 9:19 PM, Timoth?e Nicolas wrote: > > Hi, > > Problem solved ! > > Several points to be noticed : > > 1. First the discrepancy between creations and destructions showed I had forgotten some VecDestroy, and also a VecRestoreArrayReadF90. Repairing this seemed to increase a bit the speed but did not solve the actual more serious problem seen in the log_summary. > > 2. The actual problem was a very stupid one on my side. At some point I print small diagnostics at every time step to a text file with standard Fortran write statement rather than a viewer to a binary file. I had simply forgotten to put the statement between an if statement on the rank > > if (rank.eq.0) then > > write(50) .... > > end if > > So all the processors were trying to write together to the file, which, I suppose, somehow caused all the Scatters. After adding the if statement, I recover a fast speed (about 10 ms per function evaluation). Thank you so much for your help, I would never have made it so far without it !!! > > 3. Last minor point, a discrepancy of 1 remains between creations and destructions (see below extract of log_summary) for the category "viewer". I have checked that it is also the case for the examples ex5f90.F and ex5.c on which my code is based. I can't track it down, but it's probably a minor point anyway. 
> > --- Event Stage 0: Main Stage > > SNES 1 1 1332 0 > SNESLineSearch 1 1 864 0 > DMSNES 2 2 1328 0 > Vector 20 20 34973120 0 > Vector Scatter 3 3 503488 0 > MatMFFD 1 1 768 0 > Matrix 1 1 2304 0 > Distributed Mesh 3 3 14416 0 > Star Forest Bipartite Graph 6 6 5024 0 > Discrete System 3 3 2544 0 > Index Set 6 6 187248 0 > IS L to G Mapping 2 2 184524 0 > Krylov Solver 1 1 1304 0 > DMKSP interface 1 1 648 0 > Preconditioner 1 1 880 0 > Viewer 3 2 1536 0 > PetscRandom 1 1 624 0 > ======================================================================================================================== > Because the -log_summary output is done with a viewer there has to be a viewer not yet destroyed with the output is made. Hence it will indicate one viewer still exists. This does not mean that it does not get destroyed eventually. Barry > > Best > > Timothee > > > > 2015-08-26 2:39 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 2:06 AM, Timoth?e Nicolas wrote: > > > > OK, I see, > > > > Might it be that I do something a bit funky to obtain a good guess for solve ? I had he following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the last three solutions Xkm1 and Xkm2. > > > > I define > > > > dxm2 = Xkm1 - Xkm2 > > dxm1 = Xk - Xkm1 > > > > And I use the result of the last SNESSolve as > > > > dx = Xkp1 - Xk > > > > Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) > > > > However it seems pretty local and I don't see why scatters would be required for this. > > Yes, no scatters here. > > > > > I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. > > > > Timothee > > > > PetscErrorCode :: ierr > > > > PetscScalar :: M(3,3) > > Vec :: xkm2,xkm1 > > Vec :: coef1,coef2,coef3 > > PetscScalar :: a,b,c,t,det > > > > a = user%tkm1 > > b = user%tk > > c = user%t > > t = user%t+user%dt > > > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > > > M(1,1) = (b-c)/det > > M(2,1) = (c**2-b**2)/det > > M(3,1) = (c*b**2-b*c**2)/det > > > > M(1,2) = (c-a)/det > > M(2,2) = (a**2-c**2)/det > > M(3,2) = (a*c**2-c*a**2)/det > > > > M(1,3) = (a-b)/det > > M(2,3) = (b**2-a**2)/det > > M(3,3) = (b*a**2-a*b**2)/det > > > > call VecDuplicate(x,xkm1,ierr) > > call VecDuplicate(x,xkm2,ierr) > > > > call VecDuplicate(x,coef1,ierr) > > call VecDuplicate(x,coef2,ierr) > > call VecDuplicate(x,coef3,ierr) > > > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > > > ! The following lines correspond to the following simple operation > > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > > ! 
coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > > call VecCopy(xkm2,coef1,ierr) > > call VecScale(coef1,M(1,1),ierr) > > call VecAXPY(coef1,M(1,2),xkm1,ierr) > > call VecAXPY(coef1,M(1,3),x,ierr) > > > > call VecCopy(xkm2,coef2,ierr) > > call VecScale(coef2,M(2,1),ierr) > > call VecAXPY(coef2,M(2,2),xkm1,ierr) > > call VecAXPY(coef2,M(2,3),x,ierr) > > > > call VecCopy(xkm2,coef3,ierr) > > call VecScale(coef3,M(3,1),ierr) > > call VecAXPY(coef3,M(3,2),xkm1,ierr) > > call VecAXPY(coef3,M(3,3),x,ierr) > > > > call VecCopy(coef3,x,ierr) > > call VecAXPY(x,t,coef2,ierr) > > call VecAXPY(x,t**2,coef1,ierr) > > > > call VecDestroy(xkm2,ierr) > > call VecDestroy(xkm1,ierr) > > > > call VecDestroy(coef1,ierr) > > call VecDestroy(coef2,ierr) > > call VecDestroy(coef3,ierr) > > > > > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > > > The results are kind of funky, > > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > > > look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? > > > > It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). > > > > Barry > > > > > > > > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > > > > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > > > > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. 
> > > > > > So this ~10 times increase does not look normal right ? > > > > > > Best > > > > > > Timothee NICOLAS > > > > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > > > > > Hi, > > > > > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > > > > > So I have some questions about this. > > > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > > { > > > *nfuncs = snes->nfuncs; > > > } > > > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > > { > > > ... > > > snes->nfuncs++; > > > } > > > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > > { > > > ..... > > > if (snes->pc && snes->pcside == PC_LEFT) { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > > } else { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > > } > > > } > > > > > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > > > > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > > > > > Barry > > > > > > > > > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. 
> > > >
> > > > Best regards
> > > >
> > > > Timothee NICOLAS
> > > >
> > > >
> > >
> >
>

From jed at jedbrown.org  Tue Aug 25 22:18:03 2015
From: jed at jedbrown.org (Jed Brown)
Date: Tue, 25 Aug 2015 21:18:03 -0600
Subject: [petsc-users] Function evaluation slowness ?
In-Reply-To:
References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov>
 <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov>
 <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov>
Message-ID: <87k2siagb8.fsf@jedbrown.org>

Timothée Nicolas writes:
> 2. The actual problem was a very stupid one on my side. At some point I
> print small diagnostics at every time step to a text file with standard
> Fortran write statement rather than a viewer to a binary file. I had simply
> forgotten to put the statement between an if statement on the rank
>
> if (rank.eq.0) then
>
> write(50) ....
>
> end if
>
> So all the processors were trying to write together to the file, which, I
> suppose, somehow caused all the Scatters.

It doesn't create Scatters, but it likely creates load imbalance that
will be paid for in the subsequent VecScatter.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL:

From zonexo at gmail.com  Tue Aug 25 23:12:49 2015
From: zonexo at gmail.com (TAY wee-beng)
Date: Wed, 26 Aug 2015 12:12:49 +0800
Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal
In-Reply-To:
References:
Message-ID: <55DD3CC1.5070801@gmail.com>

Hi,

I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is
for pressure. The center cell is coupled with 6 other cells (north, south,
east, west, front, back), so together 7 couplings.

size x/y/z = 4/8/10

MatStencil :: row(4,1),col(4,7)

PetscScalar :: value_insert(7)

PetscInt :: ione,iseven

ione = 1; iseven = 7

do k=ksta,kend
   do j = jsta,jend
      do i=1,size_x

         row(MatStencil_i,1) = i - 1
         row(MatStencil_j,1) = j - 1
         row(MatStencil_k,1) = k - 1
         row(MatStencil_c,1) = 0 ! 1 - 1

         value_insert = 0.d0

         if (i /= size_x) then
            col(MatStencil_i,3) = i + 1 - 1 !east
            col(MatStencil_j,3) = j - 1
            col(MatStencil_k,3) = k - 1
            col(MatStencil_c,3) = 0
            value_insert(3) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)
         end if

         if (i /= 1) then
            col(MatStencil_i,5) = i - 1 - 1 !west
            col(MatStencil_j,5) = j - 1
            col(MatStencil_k,5) = k - 1
            col(MatStencil_c,5) = 0
            value_insert(5) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)
         end if

         if (j /= size_y) then
            col(MatStencil_i,2) = i - 1 !north
            col(MatStencil_j,2) = j + 1 - 1
            col(MatStencil_k,2) = k - 1
            col(MatStencil_c,2) = 0
            value_insert(2) = (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)
         end if

         ...

         col(MatStencil_i,1) = i - 1
         col(MatStencil_j,1) = j - 1
         col(MatStencil_k,1) = k - 1
         col(MatStencil_c,1) = 0
         value_insert(1) = -value_insert(2) - value_insert(3) - value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)

         call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)

      end do
   end do
end do

but I got the error :

[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix.

The error happens at i = 4, j = 1, k = 1. So I guess it has something to
do with the boundary condition. However, I can't figure out what's wrong.
Can someone help?

Thank you

Yours sincerely,

TAY wee-beng

On 24/8/2015 5:54 PM, Timothée Nicolas wrote:
> Hi,
>
> ex5 of snes can give you an example of the two routines.
>
> The C version ex5.c uses MatSetValuesStencil whereas the Fortran90
> version ex5f90.F uses MatSetValuesLocal.
>
> However, I use MatSetValuesStencil also in Fortran, there is no
> problem, and no need to mess around with DMDAGetAO, I think.
>
> To input values in the matrix, you need to do the following :
>
> ! Declare the matstencils for matrix columns and rows
> MatStencil :: row(4,1),col(4,n)
> ! Declare the quantity which will store the actual matrix elements
> PetscScalar :: v(8)
>
> The first dimension in row and col is 4 to allow for 3 spatial
> dimensions (even if you use only 2) plus one degree of freedom if you
> have several fields in your DMDA. The second dimension is 1 for row
> (you input one row at a time) and n for col, where n is the number of
> columns that you input. For instance, if at node (1,i,j) (1 is the
> index of the degree of freedom), you have, say, 6 couplings, with
> nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for
> example, then you need to set n=6
>
> Then you define the row number by naturally doing the following,
> inside a local loop :
>
> row(MatStencil_i,1) = i -1
> row(MatStencil_j,1) = j -1
> row(MatStencil_c,1) = 1 -1
>
> the -1 are here because FORTRAN indexing is different from the native
> C indexing. I put them on the right to make this more apparent.
>
> Then the column information.
For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you > will have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... > > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it > worked that way, I succesfully computed a Jacobian that way. It is > very sensitive. If you slightly depart from the right jacobian, you > will see a huge difference compared to using matrix free with > -snes_mf, so you can hardly make a mistake because you would see it. > That's how I finally got it to work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay >: > > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to > MPI along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, > now I'm using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work > locally? > > Which is a simpler/better option? > > Is there an example in Fortran for MatSetValuesStencil? > > Do I also need to use DMDAGetAO together with MatSetValuesStencil > or MatSetValuesLocal? > > Thanks! > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 25 23:24:18 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Wed, 26 Aug 2015 13:24:18 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <55DD3CC1.5070801@gmail.com> References: <55DD3CC1.5070801@gmail.com> Message-ID: What is the definition of ksta, kend, jsta, jend ? Etc ? You are parallelized only in j and k ? What I said about the "-1" holds only if you have translated the start and end points to FORTRAN numbering after getting the corners and ghost corners from the DMDA (see ex ex5f90.F from snes) Would you mind sending the complete routine with the complete definitions of ksta,kend,jsta,jend,and size_x ? Timothee 2015-08-26 13:12 GMT+09:00 TAY wee-beng : > Hi, > > I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is > for pressure. The center cell is coupled with 6 other cells (north, south, > east, west, front, back), so together 7 couplings. > > size x/y/z = 4/8/10 > > *MatStencil :: row(4,1),col(4,7)* > > *PetscScalar :: value_insert(7)* > > *PetscInt :: ione,iseven* > > *ione = 1; iseven = 7* > > *do k=ksta,kend* > > * do j = jsta,jend* > > * do i=1,size_x* > > * row(MatStencil_i,1) = i - 1* > > * row(MatStencil_j,1) = j - 1* > > * row(MatStencil_k,1) = k - 1* > > * row(MatStencil_c,1) = 0 ! 
1 - 1* > > * value_insert = 0.d0* > > * if (i /= size_x) then* > > * col(MatStencil_i,3) = i + 1 - 1 !east* > > * col(MatStencil_j,3) = j - 1* > > * col(MatStencil_k,3) = k - 1* > > * col(MatStencil_c,3) = 0* > > * value_insert(3) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)* > > * end if* > > * if (i /= 1) then* > > * col(MatStencil_i,5) = i - 1 - 1 !west* > > * col(MatStencil_j,5) = j - 1* > > * col(MatStencil_k,5) = k - 1* > > * col(MatStencil_c,5) = 0* > > * value_insert(5) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)* > > * end if* > > * if (j /= size_y) then* > > * col(MatStencil_i,2) = i - 1 !north* > > * col(MatStencil_j,2) = j + 1 - 1* > > * col(MatStencil_k,2) = k - 1* > > * col(MatStencil_c,2) = 0* > > * value_insert(2) = > (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)* > > * end if* > > * ...* > > * col(MatStencil_i,1) = i - 1* > > * col(MatStencil_j,1) = j - 1* > > * col(MatStencil_k,1) = k - 1* > > * col(MatStencil_c,1) = 0* > > * value_insert(1) = -value_insert(2) - value_insert(3) - > value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)* > > * call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* > > * end do* > > * end do* > > * end do* > > but I got the error : > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. > > The error happens at i = 4, j = 1, k = 1. So I guess it has something to > do with the boundary condition. However, I can't figure out what's wrong. > Can someone help? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: > > Hi, > > ex5 of snes can give you an example of the two routines. > > The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version > ex5f90.F uses MatSetValuesLocal. > > However, I use MatSetValuesStencil also in Fortran, there is no problem, > and no need to mess around with DMDAGetAO, I think. > > To input values in the matrix, you need to do the following : > > ! Declare the matstencils for matrix columns and rows > MatStencil :: row(4,1),col(4,n) > ! Declare the quantity which will store the actual matrix elements > PetscScalar :: v(8) > > The first dimension in row and col is 4 to allow for 3 spatial dimensions > (even if you use only 2) plus one degree of freedom if you have several > fields in your DMDA. The second dimension is 1 for row (you input one row > at a time) and n for col, where n is the number of columns that you input. > For instance, if at node (1,i,j) (1 is the index of the degree of > freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), > (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set > n=6 > > Then you define the row number by naturally doing the following, inside a > local loop : > > row(MatStencil_i,1) = i -1 > row(MatStencil_j,1) = j -1 > row(MatStencil_c,1) = 1 -1 > > the -1 are here because FORTRAN indexing is different from the native C > indexing. I put them on the right to make this more apparent. > > Then the column information. 
For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will > have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... > > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it worked > that way, I succesfully computed a Jacobian that way. It is very sensitive. > If you slightly depart from the right jacobian, you will see a huge > difference compared to using matrix free with -snes_mf, so you can hardly > make a mistake because you would see it. That's how I finally got it to > work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> >> Thanks! >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 26 00:02:33 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 26 Aug 2015 13:02:33 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <55DD3CC1.5070801@gmail.com> Message-ID: <55DD4869.2000006@gmail.com> Hi Timothee, Yes, I only parallelized in j and k. ksta,jsta are the starting k and j values. kend,jend are the ending k and j values. However, now I am using only 1 procs. I was going to resend you my code but then I realised my mistake. I used: */call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)/**/ /* for all pts, including those at the boundary. Hence, those coupling outside the boundary is also included. I changed to: */call MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr)/* so I am now entering values individually. Is there anyway I can use the 1st option to enter all the values together even those some pts are invalid. I think it should be faster. Can I somehow tell PETSc to ignore them? Thank you Yours sincerely, TAY wee-beng On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: > What is the definition of ksta, kend, jsta, jend ? Etc ? You are > parallelized only in j and k ? 
> > What I said about the "-1" holds only if you have translated the start > and end points to FORTRAN numbering after getting the corners and > ghost corners from the DMDA (see ex ex5f90.F from snes) > > Would you mind sending the complete routine with the complete > definitions of ksta,kend,jsta,jend,and size_x ? > > Timothee > > 2015-08-26 13:12 GMT+09:00 TAY wee-beng >: > > Hi, > > I have wrote the routine for my Poisson eqn. I have only 1 DOF, > which is for pressure. The center cell is coupled with 6 other > cells (north, south, east, west, front, back), so together 7 > couplings. > > size x/y/z = 4/8/10 > > */MatStencil :: row(4,1),col(4,7)/**/ > /**/ > /**/PetscScalar :: value_insert(7)/**/ > /**/ > /**/PetscInt :: ione,iseven/**/ > /**/ > /**/ione = 1; iseven = 7/**/ > /**/ > /**/do k=ksta,kend/**/ > /**/ > /**/ do j = jsta,jend/**/ > /**/ > /**/ do i=1,size_x/**/ > /**//**/ > /**/ row(MatStencil_i,1) = i - 1/**/ > /**//**/ > /**/ row(MatStencil_j,1) = j - 1/**/ > /**//**/ > /**/ row(MatStencil_k,1) = k - 1/**/ > /**//**/ > /**/ row(MatStencil_c,1) = 0 ! 1 - 1/**/ > /**//**/ > /**/ value_insert = 0.d0/**/ > /**//**/ > /**/ if (i /= size_x) then/**/ > /**//**/ > /**/ col(MatStencil_i,3) = i + 1 - 1 !east/**/ > /**//**/ > /**/ col(MatStencil_j,3) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,3) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,3) = 0/**/ > /**//**/ > /**/ value_insert(3) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ if (i /= 1) then/**/ > /**//**/ > /**/ col(MatStencil_i,5) = i - 1 - 1 !west/**/ > /**//**/ > /**/ col(MatStencil_j,5) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,5) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,5) = 0/**/ > /**//**/ > /**/ value_insert(5) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ if (j /= size_y) then/**/ > /**//**/ > /**/ col(MatStencil_i,2) = i - 1 !north/**/ > /**//**/ > /**/ col(MatStencil_j,2) = j + 1 - 1/**/ > /**//**/ > /**/ col(MatStencil_k,2) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,2) = 0/**/ > /**//**/ > /**/ value_insert(2) = > (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ .../**/ > /**//**/ > /**/ col(MatStencil_i,1) = i - 1/**/ > /**//**/ > /**/ col(MatStencil_j,1) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,1) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,1) = 0/**/ > /**//**/ > /**/ value_insert(1) = -value_insert(2) - > value_insert(3) - value_insert(4) - value_insert(5) - > value_insert(6) - value_insert(7)/**/ > /**//**/ > /**/ call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)/**/ > /**//**/ > /**/ end do/**/ > /**//**/ > /**/ end do/**/ > /**//**/ > /**/ end do/* > > but I got the error : > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. > > The error happens at i = 4, j = 1, k = 1. So I guess it has > something to do with the boundary condition. However, I can't > figure out what's wrong. Can someone help? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: >> Hi, >> >> ex5 of snes can give you an example of the two routines. >> >> The C version ex5.c uses MatSetValuesStencil whereas the >> Fortran90 version ex5f90.F uses MatSetValuesLocal. >> >> However, I use MatSetValuesStencil also in Fortran, there is no >> problem, and no need to mess around with DMDAGetAO, I think. 
>> >> To input values in the matrix, you need to do the following : >> >> ! Declare the matstencils for matrix columns and rows >> MatStencil :: row(4,1),col(4,n) >> ! Declare the quantity which will store the actual matrix elements >> PetscScalar :: v(8) >> >> The first dimension in row and col is 4 to allow for 3 spatial >> dimensions (even if you use only 2) plus one degree of freedom if >> you have several fields in your DMDA. The second dimension is 1 >> for row (you input one row at a time) and n for col, where n is >> the number of columns that you input. For instance, if at node >> (1,i,j) (1 is the index of the degree of freedom), you have, >> say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), >> (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 >> >> Then you define the row number by naturally doing the following, >> inside a local loop : >> >> row(MatStencil_i,1) = i -1 >> row(MatStencil_j,1) = j -1 >> row(MatStencil_c,1) = 1 -1 >> >> the -1 are here because FORTRAN indexing is different from the >> native C indexing. I put them on the right to make this more >> apparent. >> >> Then the column information. For instance to declare the coupling >> with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the >> rest) you will have to write (still within the same local loop on >> i and j) >> >> col(MatStencil_i,1) = i -1 >> col(MatStencil_j,1) = j -1 >> col(MatStencil_c,1) = 1 -1 >> v(1) = whatever_it_is >> >> col(MatStencil_i,2) = i-1 -1 >> col(MatStencil_j,2) = j -1 >> col(MatStencil_c,2) = 1 -1 >> v(2) = whatever_it_is >> >> col(MatStencil_i,3) = i -1 >> col(MatStencil_j,3) = j -1 >> col(MatStencil_c,3) = 2 -1 >> v(3) = whatever_it_is >> >> ... >> ... >> .. >> >> ... >> ... >> ... >> >> Note that the index of the degree of freedom (or what field you >> are coupling to), is indicated by MatStencil_c >> >> >> Finally use MatSetValuesStencil >> >> ione = 1 >> isix = 6 >> call >> MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) >> >> If it is not clear don't hesitate to ask more details. For me it >> worked that way, I succesfully computed a Jacobian that way. It >> is very sensitive. If you slightly depart from the right >> jacobian, you will see a huge difference compared to using >> matrix free with -snes_mf, so you can hardly make a mistake >> because you would see it. That's how I finally got it to work. >> >> Best >> >> Timothee >> >> >> 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay > >: >> >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction >> (z) to MPI along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. >> However, now I'm using DM and global indices is much more >> difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to >> work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together >> with MatSetValuesStencil or MatSetValuesLocal? >> >> Thanks! >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From timothee.nicolas at gmail.com Wed Aug 26 00:15:10 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Wed, 26 Aug 2015 14:15:10 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <55DD4869.2000006@gmail.com> References: <55DD3CC1.5070801@gmail.com> <55DD4869.2000006@gmail.com> Message-ID: I don't really understand what you say, but it does not sound right. You can enter the boundary points separately and then the points outside the boundary on separate calls, like this : do j=user%ys,user%ye do i=user%xs,user%xe if (i.eq.1 .or. i.eq.user%mx .or. j .eq. 1 .or. j .eq. user%my) then ! boundary point row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = one call MatSetValuesStencil(jac_prec,ione,row,ione,col,v, & & INSERT_VALUES,ierr) else row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = undemi*dxm1*(vx_ip1j-vx_im1j) + two*user%nu*(dxm1**2+dym1**2) col(MatStencil_i,2) = i+1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = undemi*dxm1*(vx_ij-vx_ip1j) - user%nu*dxm1**2 col(MatStencil_i,3) = i-1 -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 1 -1 v(3) = -undemi*dxm1*(vx_ij-vx_im1j) - user%nu*dxm1**2 col(MatStencil_i,4) = i -1 col(MatStencil_j,4) = j+1 -1 col(MatStencil_c,4) = 1 -1 v(4) = undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,5) = i -1 col(MatStencil_j,5) = j-1 -1 col(MatStencil_c,5) = 1 -1 v(5) = -undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,6) = i -1 col(MatStencil_j,6) = j -1 col(MatStencil_c,6) = 2 -1 v(6) = undemi*dym1*(vx_ijp1-vx_ijm1) col(MatStencil_i,7) = i+1 -1 col(MatStencil_j,7) = j -1 col(MatStencil_c,7) = 2 -1 v(7) = -undemi*dxm1*vy_ip1j col(MatStencil_i,8) = i-1 -1 col(MatStencil_j,8) = j -1 col(MatStencil_c,8) = 2 -1 v(8) = undemi*dxm1*vy_im1j call MatSetValuesStencil(jac_prec,ione,row,ieight,col,v, & & INSERT_VALUES,ierr) endif enddo enddo Timothee 2015-08-26 14:02 GMT+09:00 TAY wee-beng : > Hi Timothee, > > Yes, I only parallelized in j and k. ksta,jsta are the starting k and j > values. kend,jend are the ending k and j values. > > However, now I am using only 1 procs. > > I was going to resend you my code but then I realised my mistake. I used: > > *call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* > > for all pts, including those at the boundary. Hence, those coupling > outside the boundary is also included. > > I changed to: > > *call > MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr)* > > so I am now entering values individually. > > Is there anyway I can use the 1st option to enter all the values together > even those some pts are invalid. I think it should be faster. Can I somehow > tell PETSc to ignore them? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: > > What is the definition of ksta, kend, jsta, jend ? Etc ? You are > parallelized only in j and k ? > > What I said about the "-1" holds only if you have translated the start and > end points to FORTRAN numbering after getting the corners and ghost corners > from the DMDA (see ex ex5f90.F from snes) > > Would you mind sending the complete routine with the complete definitions > of ksta,kend,jsta,jend,and size_x ? 
> > Timothee > > 2015-08-26 13:12 GMT+09:00 TAY wee-beng : > >> Hi, >> >> I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is >> for pressure. The center cell is coupled with 6 other cells (north, south, >> east, west, front, back), so together 7 couplings. >> >> size x/y/z = 4/8/10 >> >> *MatStencil :: row(4,1),col(4,7)* >> >> *PetscScalar :: value_insert(7)* >> >> *PetscInt :: ione,iseven* >> >> *ione = 1; iseven = 7* >> >> *do k=ksta,kend* >> >> * do j = jsta,jend* >> >> * do i=1,size_x* >> >> * row(MatStencil_i,1) = i - 1* >> >> * row(MatStencil_j,1) = j - 1* >> >> * row(MatStencil_k,1) = k - 1* >> >> * row(MatStencil_c,1) = 0 ! 1 - 1* >> >> * value_insert = 0.d0* >> >> * if (i /= size_x) then* >> >> * col(MatStencil_i,3) = i + 1 - 1 !east* >> >> * col(MatStencil_j,3) = j - 1* >> >> * col(MatStencil_k,3) = k - 1* >> >> * col(MatStencil_c,3) = 0* >> >> * value_insert(3) = >> (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)* >> >> * end if* >> >> * if (i /= 1) then* >> >> * col(MatStencil_i,5) = i - 1 - 1 !west* >> >> * col(MatStencil_j,5) = j - 1* >> >> * col(MatStencil_k,5) = k - 1* >> >> * col(MatStencil_c,5) = 0* >> >> * value_insert(5) = >> (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)* >> >> * end if* >> >> * if (j /= size_y) then* >> >> * col(MatStencil_i,2) = i - 1 !north* >> >> * col(MatStencil_j,2) = j + 1 - 1* >> >> * col(MatStencil_k,2) = k - 1* >> >> * col(MatStencil_c,2) = 0* >> >> * value_insert(2) = >> (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)* >> >> * end if* >> >> * ...* >> >> * col(MatStencil_i,1) = i - 1* >> >> * col(MatStencil_j,1) = j - 1* >> >> * col(MatStencil_k,1) = k - 1* >> >> * col(MatStencil_c,1) = 0* >> >> * value_insert(1) = -value_insert(2) - value_insert(3) - >> value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)* >> >> * call >> MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* >> >> * end do* >> >> * end do* >> >> * end do* >> >> but I got the error : >> >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. >> >> The error happens at i = 4, j = 1, k = 1. So I guess it has something to >> do with the boundary condition. However, I can't figure out what's wrong. >> Can someone help? >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: >> >> Hi, >> >> ex5 of snes can give you an example of the two routines. >> >> The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 >> version ex5f90.F uses MatSetValuesLocal. >> >> However, I use MatSetValuesStencil also in Fortran, there is no problem, >> and no need to mess around with DMDAGetAO, I think. >> >> To input values in the matrix, you need to do the following : >> >> ! Declare the matstencils for matrix columns and rows >> MatStencil :: row(4,1),col(4,n) >> ! Declare the quantity which will store the actual matrix elements >> PetscScalar :: v(8) >> >> The first dimension in row and col is 4 to allow for 3 spatial dimensions >> (even if you use only 2) plus one degree of freedom if you have several >> fields in your DMDA. The second dimension is 1 for row (you input one row >> at a time) and n for col, where n is the number of columns that you input. 
>> For instance, if at node (1,i,j) (1 is the index of the degree of >> freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), >> (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set >> n=6 >> >> Then you define the row number by naturally doing the following, inside a >> local loop : >> >> row(MatStencil_i,1) = i -1 >> row(MatStencil_j,1) = j -1 >> row(MatStencil_c,1) = 1 -1 >> >> the -1 are here because FORTRAN indexing is different from the native C >> indexing. I put them on the right to make this more apparent. >> >> Then the column information. For instance to declare the coupling with >> node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will >> have to write (still within the same local loop on i and j) >> >> col(MatStencil_i,1) = i -1 >> col(MatStencil_j,1) = j -1 >> col(MatStencil_c,1) = 1 -1 >> v(1) = whatever_it_is >> >> col(MatStencil_i,2) = i-1 -1 >> col(MatStencil_j,2) = j -1 >> col(MatStencil_c,2) = 1 -1 >> v(2) = whatever_it_is >> >> col(MatStencil_i,3) = i -1 >> col(MatStencil_j,3) = j -1 >> col(MatStencil_c,3) = 2 -1 >> v(3) = whatever_it_is >> >> ... >> ... >> .. >> >> ... >> ... >> ... >> >> Note that the index of the degree of freedom (or what field you are >> coupling to), is indicated by MatStencil_c >> >> >> Finally use MatSetValuesStencil >> >> ione = 1 >> isix = 6 >> call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) >> >> If it is not clear don't hesitate to ask more details. For me it worked >> that way, I succesfully computed a Jacobian that way. It is very sensitive. >> If you slightly depart from the right jacobian, you will see a huge >> difference compared to using matrix free with -snes_mf, so you can hardly >> make a mistake because you would see it. That's how I finally got it to >> work. >> >> Best >> >> Timothee >> >> >> 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < >> zonexo at gmail.com>: >> >>> Hi, >>> >>> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >>> along 2 directions (y,z) >>> >>> Previously I was using MatSetValues with global indices. However, now >>> I'm using DM and global indices is much more difficult. >>> >>> I come across MatSetValuesStencil or MatSetValuesLocal. >>> >>> So what's the difference bet the one since they both seem to work >>> locally? >>> >>> Which is a simpler/better option? >>> >>> Is there an example in Fortran for MatSetValuesStencil? >>> >>> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >>> MatSetValuesLocal? >>> >>> Thanks! >>> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 26 13:00:36 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 20:00:36 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> Message-ID: <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> Dear all, Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the dense matrix [V1 V2 ... Vn]? 
Thanks, Regards Nicolas From jed at jedbrown.org Wed Aug 26 13:38:37 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 26 Aug 2015 12:38:37 -0600 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> Message-ID: <87pp2999oy.fsf@jedbrown.org> Nicolas Pozin writes: > Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the dense matrix [V1 V2 ... Vn]? What do you want to do with that matrix? The vector representation is pretty flexible and the memory semantics are similar unless you store the dense matrix row-aligned (not the default). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From nicolas.pozin at inria.fr Wed Aug 26 15:06:32 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 22:06:32 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <87pp2999oy.fsf@jedbrown.org> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> Message-ID: <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Thank you for this answer. What I want to do is to get the lines of this matrix and store them in vectors. ----- Mail original ----- > De: "Jed Brown" > ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > Nicolas Pozin writes: > > Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the > > dense matrix [V1 V2 ... Vn]? > > What do you want to do with that matrix? The vector representation is > pretty flexible and the memory semantics are similar unless you store > the dense matrix row-aligned (not the default). > From bsmith at mcs.anl.gov Wed Aug 26 15:21:04 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 15:21:04 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Message-ID: > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: > > Thank you for this answer. > > What I want to do is to get the lines of this matrix and store them in vectors. If you want to treat the columns of the dense matrix as vectors then use MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the first row of each column of the obtained array (PETSc dense matrices are stored by column; same as for example LAPACK). But if you explained more why you want to treat something sometimes as a Mat (which is a linear operator on vectors) and sometimes as vectors we might be able to suggest how to organize your code. 
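A rough sequential sketch of the column-as-Vec idea described above, assuming a SeqDense matrix with the default column-major layout whose leading dimension equals the number of rows; the helper name and the norm computation are made up for illustration only:

#include <petscmat.h>

/* Wrap each column of a dense Mat as a Vec without copying, using
   MatDenseGetArray(), and print the 2-norm of each column. */
PetscErrorCode PrintColumnNorms(Mat A)
{
  PetscScalar    *a;
  PetscReal      nrm;
  PetscInt       m,n,j;
  Vec            col;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  ierr = MatDenseGetArray(A,&a);CHKERRQ(ierr);       /* stored by column */
  for (j=0; j<n; j++) {
    /* column j starts at a + j*m because the leading dimension is m here */
    ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,1,m,a+j*m,&col);CHKERRQ(ierr);
    ierr = VecNorm(col,NORM_2,&nrm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF,"column %D: norm %g\n",j,(double)nrm);CHKERRQ(ierr);
    ierr = VecDestroy(&col);CHKERRQ(ierr);
  }
  ierr = MatDenseRestoreArray(A,&a);CHKERRQ(ierr);
  return 0;
}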
Barry > > > ----- Mail original ----- >> De: "Jed Brown" >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> Nicolas Pozin writes: >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the >>> dense matrix [V1 V2 ... Vn]? >> >> What do you want to do with that matrix? The vector representation is >> pretty flexible and the memory semantics are similar unless you store >> the dense matrix row-aligned (not the default). >> From nicolas.pozin at inria.fr Wed Aug 26 15:41:07 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 22:41:07 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Message-ID: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where -d is a sparse matrix of size (n1,m1) -A is a dense symetric matrix of size size (n1,n1) with m1 very big compared to n1 (1 million against a few dozens). The problem is too big to allow the use of MatMatMult. What I planned to do : -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th column of A : quick since d is sparse and n1 is small -deduce the matrix transpose(d)*A = [V1 ... Vn] and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through -transpose([V1 ...Vn]) and get its columns C1 ... Cn -conclude on the i-th diagonal value which is the i-th component of tranpose(d)*Ci ----- Mail original ----- > De: "Barry Smith" > ?: "Nicolas Pozin" > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: > > > > Thank you for this answer. > > > > What I want to do is to get the lines of this matrix and store them in > > vectors. > > If you want to treat the columns of the dense matrix as vectors then use > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the > first row of each column of the obtained array (PETSc dense matrices are > stored by column; same as for example LAPACK). > > But if you explained more why you want to treat something sometimes as a > Mat (which is a linear operator on vectors) and sometimes as vectors we > might be able to suggest how to organize your code. > > Barry > > > > > > > ----- Mail original ----- > >> De: "Jed Brown" > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >> > >> Nicolas Pozin writes: > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form > >>> the > >>> dense matrix [V1 V2 ... Vn]? > >> > >> What do you want to do with that matrix? The vector representation is > >> pretty flexible and the memory semantics are similar unless you store > >> the dense matrix row-aligned (not the default). 
> >> > > From bsmith at mcs.anl.gov Wed Aug 26 15:58:12 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 15:58:12 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> Since A is tiny I am assuming you are doing this sequentially only? Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ matrix? > On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). > > The problem is too big to allow the use of MatMatMult. > What I planned to do : > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th column of A : quick since d is sparse and n1 is small > -deduce the matrix transpose(d)*A = [V1 ... Vn] > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > -conclude on the i-th diagonal value which is the i-th component of tranpose(d)*Ci > > > > ----- Mail original ----- >> De: "Barry Smith" >> ?: "Nicolas Pozin" >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> >>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: >>> >>> Thank you for this answer. >>> >>> What I want to do is to get the lines of this matrix and store them in >>> vectors. >> >> If you want to treat the columns of the dense matrix as vectors then use >> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the >> first row of each column of the obtained array (PETSc dense matrices are >> stored by column; same as for example LAPACK). >> >> But if you explained more why you want to treat something sometimes as a >> Mat (which is a linear operator on vectors) and sometimes as vectors we >> might be able to suggest how to organize your code. >> >> Barry >> >>> >>> >>> ----- Mail original ----- >>>> De: "Jed Brown" >>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>> >>>> Nicolas Pozin writes: >>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form >>>>> the >>>>> dense matrix [V1 V2 ... Vn]? >>>> >>>> What do you want to do with that matrix? The vector representation is >>>> pretty flexible and the memory semantics are similar unless you store >>>> the dense matrix row-aligned (not the default). 
>>>> >> >> From nicolas.pozin at inria.fr Wed Aug 26 16:01:29 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 23:01:29 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> Message-ID: <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> Yes, this is sequentially and yes again, d is stored as a AIJ matrix ----- Mail original ----- > De: "Barry Smith" > ?: "Nicolas Pozin" > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 22:58:12 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > Since A is tiny I am assuming you are doing this sequentially only? > > Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ > matrix? > > > > > > On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: > > > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > > -d is a sparse matrix of size (n1,m1) > > -A is a dense symetric matrix of size size (n1,n1) > > with m1 very big compared to n1 (1 million against a few dozens). > > > > The problem is too big to allow the use of MatMatMult. > > What I planned to do : > > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th > > column of A : quick since d is sparse and n1 is small > > -deduce the matrix transpose(d)*A = [V1 ... Vn] > > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > > -conclude on the i-th diagonal value which is the i-th component of > > tranpose(d)*Ci > > > > > > > > ----- Mail original ----- > >> De: "Barry Smith" > >> ?: "Nicolas Pozin" > >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov > >> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >> > >> > >>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin > >>> wrote: > >>> > >>> Thank you for this answer. > >>> > >>> What I want to do is to get the lines of this matrix and store them in > >>> vectors. > >> > >> If you want to treat the columns of the dense matrix as vectors then use > >> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the > >> first row of each column of the obtained array (PETSc dense matrices are > >> stored by column; same as for example LAPACK). > >> > >> But if you explained more why you want to treat something sometimes as a > >> Mat (which is a linear operator on vectors) and sometimes as vectors we > >> might be able to suggest how to organize your code. > >> > >> Barry > >> > >>> > >>> > >>> ----- Mail original ----- > >>>> De: "Jed Brown" > >>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > >>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >>>> > >>>> Nicolas Pozin writes: > >>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form > >>>>> the > >>>>> dense matrix [V1 V2 ... Vn]? > >>>> > >>>> What do you want to do with that matrix? 
The vector representation is > >>>> pretty flexible and the memory semantics are similar unless you store > >>>> the dense matrix row-aligned (not the default). > >>>> > >> > >> > > From patrick.sanan at gmail.com Wed Aug 26 16:02:41 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 26 Aug 2015 23:02:41 +0200 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 26, 2015 at 10:41 PM, Nicolas Pozin wrote: > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). > If I read this correctly, another way to phrase what you need is ||d_i||_A^2 = , for a few dozen values of i . Naively you could do that by iterating through an array of Vec objects (which need not all be stored in memory simultaneously), calling MatMult followed by VecDot. You could perhaps get more clever later (if the size of the system justifies it) by doing things like using non-blocking/split versions of VecDot (or VecMDot) so that you can overlap the matrix multiplications with the dot products. > > The problem is too big to allow the use of MatMatMult. > What I planned to do : > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th > column of A : quick since d is sparse and n1 is small > -deduce the matrix transpose(d)*A = [V1 ... Vn] > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > -conclude on the i-th diagonal value which is the i-th component of > tranpose(d)*Ci > > > > ----- Mail original ----- > > De: "Barry Smith" > > ?: "Nicolas Pozin" > > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > > > > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin > wrote: > > > > > > Thank you for this answer. > > > > > > What I want to do is to get the lines of this matrix and store them in > > > vectors. > > > > If you want to treat the columns of the dense matrix as vectors then > use > > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to > the > > first row of each column of the obtained array (PETSc dense matrices > are > > stored by column; same as for example LAPACK). > > > > But if you explained more why you want to treat something sometimes as > a > > Mat (which is a linear operator on vectors) and sometimes as vectors we > > might be able to suggest how to organize your code. > > > > Barry > > > > > > > > > > > ----- Mail original ----- > > >> De: "Jed Brown" > > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > > >> > > >> Nicolas Pozin writes: > > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to > form > > >>> the > > >>> dense matrix [V1 V2 ... Vn]? > > >> > > >> What do you want to do with that matrix? 
The vector representation is > > >> pretty flexible and the memory semantics are similar unless you store > > >> the dense matrix row-aligned (not the default). > > >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Wed Aug 26 16:04:17 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 26 Aug 2015 23:04:17 +0200 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 26, 2015 at 11:02 PM, Patrick Sanan wrote: > > > On Wed, Aug 26, 2015 at 10:41 PM, Nicolas Pozin > wrote: > >> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where >> -d is a sparse matrix of size (n1,m1) >> -A is a dense symetric matrix of size size (n1,n1) >> with m1 very big compared to n1 (1 million against a few dozens). >> > If I read this correctly, another way to phrase what you need is > ||d_i||_A^2 = , for a few dozen values of i . Naively you could > do that by iterating through an array of Vec objects (which need not all be > stored in memory simultaneously), calling MatMult followed by VecDot. You > could perhaps get more clever later (if the size of the system justifies > it) by doing things like using non-blocking/split versions of VecDot (or > VecMDot) so that you can overlap the matrix multiplications with the dot > products. > Ah, sorry, I had the sparsity of A and d reversed in my reading. > >> The problem is too big to allow the use of MatMatMult. >> What I planned to do : >> -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th >> column of A : quick since d is sparse and n1 is small >> -deduce the matrix transpose(d)*A = [V1 ... Vn] >> and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through >> -transpose([V1 ...Vn]) and get its columns C1 ... Cn >> -conclude on the i-th diagonal value which is the i-th component of >> tranpose(d)*Ci >> >> >> >> ----- Mail original ----- >> > De: "Barry Smith" >> > ?: "Nicolas Pozin" >> > Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >> > Objet: Re: [petsc-users] forming a matrix from a set of vectors >> > >> > >> > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin >> wrote: >> > > >> > > Thank you for this answer. >> > > >> > > What I want to do is to get the lines of this matrix and store them in >> > > vectors. >> > >> > If you want to treat the columns of the dense matrix as vectors then >> use >> > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to >> the >> > first row of each column of the obtained array (PETSc dense matrices >> are >> > stored by column; same as for example LAPACK). >> > >> > But if you explained more why you want to treat something sometimes >> as a >> > Mat (which is a linear operator on vectors) and sometimes as vectors >> we >> > might be able to suggest how to organize your code. 
>> > >> > Barry >> > >> > > >> > > >> > > ----- Mail original ----- >> > >> De: "Jed Brown" >> > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >> > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >> > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> > >> >> > >> Nicolas Pozin writes: >> > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to >> form >> > >>> the >> > >>> dense matrix [V1 V2 ... Vn]? >> > >> >> > >> What do you want to do with that matrix? The vector representation >> is >> > >> pretty flexible and the memory semantics are similar unless you store >> > >> the dense matrix row-aligned (not the default). >> > >> >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 26 16:45:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 26 Aug 2015 15:45:05 -0600 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: <87k2sh9126.fsf@jedbrown.org> Nicolas Pozin writes: > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). The result will be m1 ? m1, but at most rank n1. Why would you want that monster as a matrix? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From nicolas.pozin at inria.fr Wed Aug 26 16:55:38 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 23:55:38 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <87k2sh9126.fsf@jedbrown.org> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <87k2sh9126.fsf@jedbrown.org> Message-ID: <457017974.10055822.1440626138775.JavaMail.zimbra@inria.fr> I'm working on a finite element system of the type (A+B)X=Y where A is a classic sparse symetric matrix and B is this transpose(d)*A*d. All the degrees of freedom are coupled (B dense), this is a physical property of the problem I deal with... To solve it I use a conjugate gradient with jacobi preconditionner (which proves to be satisying here) . So I need the diagonal of B... and for now this is clearly the most time-consuming part of my code. ----- Mail original ----- > De: "Jed Brown" > ?: "Nicolas Pozin" , "Barry Smith" > Cc: petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 23:45:05 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > Nicolas Pozin writes: > > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > > -d is a sparse matrix of size (n1,m1) > > -A is a dense symetric matrix of size size (n1,n1) > > with m1 very big compared to n1 (1 million against a few dozens). > > The result will be m1 ? m1, but at most rank n1. 
Why would you want > that monster as a matrix? > From bsmith at mcs.anl.gov Wed Aug 26 22:00:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 22:00:44 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> Message-ID: <44BE97E7-4D65-46BE-B8E0-0B77D808DC0B@mcs.anl.gov> Nicolas, I believe the best way to do this is to write a code that specifically knows the SeqAIJ matrix structure and does the product directly with that data structure instead of trying to "cook something up" using the standard higher level PETSc routines. So you need to have #include <../src/mat/impls/aij/seq/aij.h> directly in your code. See for example src/mat/impls/aij/seq/aij.c Next you need to store c = transpose(d) since storing the matrix d in PETSc sparse format (which is row based) is terrible for the operation you want to perform. So ideally just create the matrix c and put its values in; if that is too difficult you can use MatTranspose() to get c from d. Now to the algorithm T = c *(A*transpose(c)) note that the columns of transpose(c) (i.e. the columns of d) are the rows of c, let c(i,*) represent a row of c. The diagonals of T are T(i,i) = c(i,*) * (A * c(i,*)) Let s = A * c(i,*) now to implement this just loop over i. Compute s (an array) using directly the data in the sparse matrix c (use MatDenseGetArray()) to access the values in A) then compute c(i,*) * s; then move on to the next i. This will be very efficient (computes only exactly what is needed) and requires only the array s which is small (size of number of rows of A, not c). Barry > On Aug 26, 2015, at 4:01 PM, Nicolas Pozin wrote: > > Yes, this is sequentially and yes again, d is stored as a AIJ matrix > > ----- Mail original ----- >> De: "Barry Smith" >> ?: "Nicolas Pozin" >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 22:58:12 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> >> Since A is tiny I am assuming you are doing this sequentially only? >> >> Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ >> matrix? >> >> >> >> >>> On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: >>> >>> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where >>> -d is a sparse matrix of size (n1,m1) >>> -A is a dense symetric matrix of size size (n1,n1) >>> with m1 very big compared to n1 (1 million against a few dozens). >>> >>> The problem is too big to allow the use of MatMatMult. >>> What I planned to do : >>> -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th >>> column of A : quick since d is sparse and n1 is small >>> -deduce the matrix transpose(d)*A = [V1 ... Vn] >>> and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through >>> -transpose([V1 ...Vn]) and get its columns C1 ... 
Cn >>> -conclude on the i-th diagonal value which is the i-th component of >>> tranpose(d)*Ci >>> >>> >>> >>> ----- Mail original ----- >>>> De: "Barry Smith" >>>> ?: "Nicolas Pozin" >>>> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >>>> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>> >>>> >>>>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin >>>>> wrote: >>>>> >>>>> Thank you for this answer. >>>>> >>>>> What I want to do is to get the lines of this matrix and store them in >>>>> vectors. >>>> >>>> If you want to treat the columns of the dense matrix as vectors then use >>>> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the >>>> first row of each column of the obtained array (PETSc dense matrices are >>>> stored by column; same as for example LAPACK). >>>> >>>> But if you explained more why you want to treat something sometimes as a >>>> Mat (which is a linear operator on vectors) and sometimes as vectors we >>>> might be able to suggest how to organize your code. >>>> >>>> Barry >>>> >>>>> >>>>> >>>>> ----- Mail original ----- >>>>>> De: "Jed Brown" >>>>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >>>>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >>>>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>>>> >>>>>> Nicolas Pozin writes: >>>>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form >>>>>>> the >>>>>>> dense matrix [V1 V2 ... Vn]? >>>>>> >>>>>> What do you want to do with that matrix? The vector representation is >>>>>> pretty flexible and the memory semantics are similar unless you store >>>>>> the dense matrix row-aligned (not the default). >>>>>> >>>> >>>> >> >> From zonexo at gmail.com Thu Aug 27 01:05:59 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 27 Aug 2015 14:05:59 +0800 Subject: [petsc-users] Problem with linking PETSc Message-ID: <55DEA8C7.5010100@gmail.com> Hi, I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. However, I now have problem linking the files on VS2008 to create the final exe. The error is: /*1>Compiling manifest to resources...*//* *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* *//*1>Linking...*//* *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* *//*...*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol VECRESTOREARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol DMLOCALTOLOCALBEGIN referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol PETSCINITIALIZE referenced in function MAIN__*//* *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error LNK1120: 74 unresolved externals*//* *//*1>*//* *//*1>Build log written to "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ I did not do much changes since the prev PETSc worked. I only changed the directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment variables. I wonder what's wrong. -- Thank you Yours sincerely, TAY wee-beng -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Thu Aug 27 01:08:15 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 27 Aug 2015 14:08:15 +0800 Subject: [petsc-users] Problem with linking PETSc 2 Message-ID: <55DEA94F.9040407@gmail.com> Hi, I forgot to add that I also changed the MPI lib to those used by Intel MPI. -- Thank you Yours sincerely, TAY wee-beng From balay at mcs.anl.gov Thu Aug 27 10:38:59 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Aug 2015 10:38:59 -0500 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: <55DEA8C7.5010100@gmail.com> References: <55DEA8C7.5010100@gmail.com> Message-ID: Are you able to compile and run both C and fortran petsc examples using the corresponding makefile? Satish On Thu, 27 Aug 2015, TAY wee-beng wrote: > Hi, > > I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. > > Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). > Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. > > However, I now have problem linking the files on VS2008 to create the final > exe. The error is: > > /*1>Compiling manifest to resources...*//* > *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* > *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* > *//*1>Linking...*//* > *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > *//*...*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > VECRESTOREARRAY referenced in function > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > DMLOCALTOLOCALBEGIN referenced in function > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol > PETSCINITIALIZE referenced in function MAIN__*//* > *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error > LNK1120: 74 unresolved externals*//* > *//*1>*//* > *//*1>Build log written to > "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* > *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* > *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ > > I did not do much changes since the prev PETSc worked. I only changed the > directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment > variables. I wonder what's wrong. > > From zonexo at gmail.com Thu Aug 27 10:45:12 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Thu, 27 Aug 2015 23:45:12 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <55DD3CC1.5070801@gmail.com> <55DD4869.2000006@gmail.com> Message-ID: <1440690313501-f2c9870c-a80854bd-4d8201cd@gmail.com> Hi Timothee, That's a better way. Thanks Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Wed, Aug 26, 2015 at 1:15 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com [timothee.nicolas at gmail.com] > wrote: I don't really understand what you say, but it does not sound right. You can enter the boundary points separately and then the points outside the boundary on separate calls, like this : do j=user%ys,user%ye do i=user%xs,user%xe if (i.eq.1 .or. i.eq.user%mx .or. j .eq. 1 .or. j .eq. user%my) then ! 
boundary point row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = one call MatSetValuesStencil(jac_prec,ione,row,ione,col,v, & & INSERT_VALUES,ierr) else row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = undemi*dxm1*(vx_ip1j-vx_im1j) + two*user%nu*(dxm1**2+dym1**2) col(MatStencil_i,2) = i+1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = undemi*dxm1*(vx_ij-vx_ip1j) - user%nu*dxm1**2 col(MatStencil_i,3) = i-1 -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 1 -1 v(3) = -undemi*dxm1*(vx_ij-vx_im1j) - user%nu*dxm1**2 col(MatStencil_i,4) = i -1 col(MatStencil_j,4) = j+1 -1 col(MatStencil_c,4) = 1 -1 v(4) = undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,5) = i -1 col(MatStencil_j,5) = j-1 -1 col(MatStencil_c,5) = 1 -1 v(5) = -undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,6) = i -1 col(MatStencil_j,6) = j -1 col(MatStencil_c,6) = 2 -1 v(6) = undemi*dym1*(vx_ijp1-vx_ijm1) col(MatStencil_i,7) = i+1 -1 col(MatStencil_j,7) = j -1 col(MatStencil_c,7) = 2 -1 v(7) = -undemi*dxm1*vy_ip1j col(MatStencil_i,8) = i-1 -1 col(MatStencil_j,8) = j -1 col(MatStencil_c,8) = 2 -1 v(8) = undemi*dxm1*vy_im1j call MatSetValuesStencil(jac_prec,ione,row,ieight,col,v, & & INSERT_VALUES,ierr) endif enddo enddo Timothee 2015-08-26 14:02 GMT+09:00 TAY wee-beng < zonexo at gmail.com [zonexo at gmail.com] > : Hi Timothee, Yes, I only parallelized in j and k. ksta,jsta are the starting k and j values. kend,jend are the ending k and j values. However, now I am using only 1 procs. I was going to resend you my code but then I realised my mistake. I used: call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr) for all pts, including those at the boundary. Hence, those coupling outside the boundary is also included. I changed to: call MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr) so I am now entering values individually. Is there anyway I can use the 1st option to enter all the values together even those some pts are invalid. I think it should be faster. Can I somehow tell PETSc to ignore them? Thank you Yours sincerely, TAY wee-beng On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: What is the definition of ksta, kend, jsta, jend ? Etc ? You are parallelized only in j and k ? What I said about the "-1" holds only if you have translated the start and end points to FORTRAN numbering after getting the corners and ghost corners from the DMDA (see ex ex5f90.F from snes) Would you mind sending the complete routine with the complete definitions of ksta,kend,jsta,jend,and size_x ? Timothee 2015-08-26 13:12 GMT+09:00 TAY wee-beng < zonexo at gmail.com [zonexo at gmail.com] > : Hi, I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is for pressure. The center cell is coupled with 6 other cells (north, south, east, west, front, back), so together 7 couplings. size x/y/z = 4/8/10 MatStencil :: row(4,1),col(4,7) PetscScalar :: value_insert(7) PetscInt :: ione,iseven ione = 1; iseven = 7 do k=ksta,kend do j = jsta,jend do i=1,size_x row(MatStencil_i,1) = i - 1 row(MatStencil_j,1) = j - 1 row(MatStencil_k,1) = k - 1 row(MatStencil_c,1) = 0 ! 
1 - 1 value_insert = 0.d0 if (i /= size_x) then col(MatStencil_i,3) = i + 1 - 1 !east col(MatStencil_j,3) = j - 1 col(MatStencil_k,3) = k - 1 col(MatStencil_c,3) = 0 value_insert(3) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W) end if if (i /= 1) then col(MatStencil_i,5) = i - 1 - 1 !west col(MatStencil_j,5) = j - 1 col(MatStencil_k,5) = k - 1 col(MatStencil_c,5) = 0 value_insert(5) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E) end if if (j /= size_y) then col(MatStencil_i,2) = i - 1 !north col(MatStencil_j,2) = j + 1 - 1 col(MatStencil_k,2) = k - 1 col(MatStencil_c,2) = 0 value_insert(2) = (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S) end if ... col(MatStencil_i,1) = i - 1 col(MatStencil_j,1) = j - 1 col(MatStencil_k,1) = k - 1 col(MatStencil_c,1) = 0 value_insert(1) = -value_insert(2) - value_insert(3) - value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7) call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr) end do end do end do but I got the error : [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. The error happens at i = 4, j = 1, k = 1. So I guess it has something to do with the boundary condition. However, I can't figure out what's wrong. Can someone help? Thank you Yours sincerely, TAY wee-beng On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. 
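For reference, the same call in C (zero-based indices, so the "-1" shifts disappear) looks roughly like the sketch below; the couplings and values are placeholders rather than code from this thread, and snes ex5.c shows the full pattern inside the usual local i/j loops:

#include <petscmat.h>

/* Sketch: insert one Jacobian row at grid point (i,j), component 0, coupled to
   itself, its west neighbour, and component 1 at the same point.  a_self,
   a_west, a_other are placeholder values. */
static PetscErrorCode InsertOneRow(Mat J, PetscInt i, PetscInt j,
                                   PetscScalar a_self, PetscScalar a_west, PetscScalar a_other)
{
  MatStencil     row, col[3];
  PetscScalar    v[3];
  PetscErrorCode ierr;

  row.i = i;      row.j = j; row.c = 0;          /* the row being inserted        */
  col[0].i = i;   col[0].j = j; col[0].c = 0; v[0] = a_self;   /* self coupling   */
  col[1].i = i-1; col[1].j = j; col[1].c = 0; v[1] = a_west;   /* west neighbour  */
  col[2].i = i;   col[2].j = j; col[2].c = 1; v[2] = a_other;  /* other component */
  ierr = MatSetValuesStencil(J, 1, &row, 3, col, v, INSERT_VALUES);CHKERRQ(ierr);
  return 0;
}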
For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < [zonexo at gmail.com] zonexo at gmail.com [zonexo at gmail.com] > : Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Thu Aug 27 19:00:56 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 20:00:56 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields Message-ID: I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? -gideon From bsmith at mcs.anl.gov Thu Aug 27 21:02:45 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 21:02:45 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: Gideon, Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. 
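A rough sketch of that manual grid-sequencing loop for a single DM follows. This is illustrative only: SolveWithGridSequencing and nlevels are made-up names, in petsc 3.5 the interpolation routine is spelled DMCreateInterpolation() (older releases called it DMGetInterpolation()), and for a DMComposite the redundant scalar unknowns would simply be carried over to the next level.

#include <petscsnes.h>

/* Sketch: solve on a coarse DM, then repeatedly refine, interpolate the
   solution as the initial guess, and re-solve on the finer level. */
PetscErrorCode SolveWithGridSequencing(SNES snes, DM dmcoarse, PetscInt nlevels, Vec *Xfine)
{
  DM             dmc, dmf;
  Vec            Xc, Xf;
  Mat            interp;
  PetscInt       l;
  PetscErrorCode ierr;

  dmc  = dmcoarse;
  ierr = PetscObjectReference((PetscObject)dmc);CHKERRQ(ierr); /* keep our own reference */
  ierr = DMCreateGlobalVector(dmc, &Xc);CHKERRQ(ierr);
  ierr = SNESSetDM(snes, dmc);CHKERRQ(ierr);
  ierr = SNESSolve(snes, NULL, Xc);CHKERRQ(ierr);              /* solve on the coarsest grid */
  for (l = 1; l < nlevels; l++) {
    ierr = DMRefine(dmc, PetscObjectComm((PetscObject)dmc), &dmf);CHKERRQ(ierr);
    ierr = DMCreateInterpolation(dmc, dmf, &interp, NULL);CHKERRQ(ierr);
    ierr = DMCreateGlobalVector(dmf, &Xf);CHKERRQ(ierr);
    ierr = MatInterpolate(interp, Xc, Xf);CHKERRQ(ierr);       /* coarse solution -> fine initial guess */
    ierr = SNESSetDM(snes, dmf);CHKERRQ(ierr);
    ierr = SNESSolve(snes, NULL, Xf);CHKERRQ(ierr);            /* re-solve on the refined grid */
    ierr = MatDestroy(&interp);CHKERRQ(ierr);
    ierr = VecDestroy(&Xc);CHKERRQ(ierr);
    ierr = DMDestroy(&dmc);CHKERRQ(ierr);
    dmc = dmf;  Xc = Xf;
  }
  ierr = DMDestroy(&dmc);CHKERRQ(ierr);
  *Xfine = Xc;
  return 0;
}

Note that the user's form function and Jacobian must then obtain the current DM with SNESGetDM() instead of using a DM stored in the user context, which is exactly the point that comes up later in this thread.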
Barry > On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: > > I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > > -gideon > From knepley at gmail.com Thu Aug 27 21:04:57 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Aug 2015 21:04:57 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 7:00 PM, Gideon Simpson wrote: > I?m working on a problem which, morally, can be posed as a system of > coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for > the parameters (lam, mu), and one for the vector field (u_1, u_2) on the > mesh. I have had success in solving this as a fully coupled system with > SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough > (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm > = O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was > thinking of trying would be to get away from direct solvers, and I was > hoping to use field split for this. However, it?s a bit beyond what I?ve > seen examples for because it has 2 types of variables: scalar parameters > which appear globally in the system and vector valued field variables. Any > suggestions on how to get started? Barry is right. However, I also really think we should have a nonlinear fieldsplit. I tried to write one (SNES multiblock), but no one has ever used it. I would be willing to put some time in if you need it. You would likely nonlinearly precondition the Newton solve with this, which is what X. Cai does to great effect in some problems he works on. Matt > > -gideon > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Thu Aug 27 21:11:38 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:11:38 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> HI Barry, Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation -gideon > On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: > > > Gideon, > > Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. > > Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > > Barry > >> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >> >> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >> >> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >> >> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >> >> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >> >> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >> >> -gideon >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 27 21:15:58 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Aug 2015 21:15:58 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> Message-ID: On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: > HI Barry, > > Nope, I?m not doing any grid sequencing. Clearly that makes a lot of > sense, to solve on a spatially coarse mesh for the field variables, > interpolate onto the finer mesh, and then solve again. I?m not entirely > clear on the practical implementation > SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. Matt -gideon > > On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: > > > Gideon, > > Are you using grid sequencing? 
Simply solve on a coarse grid, > interpolate u1 and u2 to a once refined version of the grid and use that > plus the mu lam as initial guess for the next level. Repeat to as fine a > grid as you want. You can use DMRefine() and DMGetInterpolation() to get > the interpolation needed to interpolate from the coarse to finer mesh. > > Then and only then you can use multigrid (with or without fieldsplit) > to solve the linear problems for finer meshes. Once you have the grid > sequencing working we can help you with this. > > Barry > > On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > > I?m working on a problem which, morally, can be posed as a system of > coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for > the parameters (lam, mu), and one for the vector field (u_1, u_2) on the > mesh. I have had success in solving this as a fully coupled system with > SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough (i.e. > 10^6-10^8 lattice points), my SNES gets stuck with the function norm = > O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was > thinking of trying would be to get away from direct solvers, and I was > hoping to use field split for this. However, it?s a bit beyond what I?ve > seen examples for because it has 2 types of variables: scalar parameters > which appear globally in the system and vector valued field variables. Any > suggestions on how to get started? > > -gideon > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Thu Aug 27 21:32:18 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:32:18 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> Message-ID: <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> I?m getting the following errors: [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? -gideon > On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > HI Barry, > > Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > > SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > > Matt > > -gideon > >> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: >> >> >> Gideon, >> >> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. 
You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >> >> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >> >> Barry >> >>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: >>> >>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>> >>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>> >>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>> >>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>> >>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>> >>> -gideon >>> >> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 27 21:37:12 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 21:37:12 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> Message-ID: <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> We need the full error message. But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. Barry Though you should not get this error even if you are using a DMDA there. > On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: > > I?m getting the following errors: > > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > > Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > > -gideon > >> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >> >> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >> HI Barry, >> >> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >> >> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >> >> Matt >> >> -gideon >> >>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>> >>> >>> Gideon, >>> >>> Are you using grid sequencing? 
Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>> >>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>> >>> Barry >>> >>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>> >>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>> >>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>> >>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>> >>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>> >>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>> >>>> -gideon >>>> >>> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From gideon.simpson at gmail.com Thu Aug 27 21:42:44 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:42:44 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> Message-ID: I have it set up as: DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); DMCompositeAddDM(user.packer,user.p_dm); DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, nx, 4, 1, NULL, &user.Q_dm); DMCompositeAddDM(user.packer,user.Q_dm); DMCreateGlobalVector(user.packer,&U); where the user.packer structure has DM packer; DM p_dm, Q_dm; Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). Here are some of the errors that are generated: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (0,3) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c -gideon > On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: > > > We need the full error message. > > But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > > Barry > > Though you should not get this error even if you are using a DMDA there. 
> >> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >> >> I?m getting the following errors: >> >> [1]PETSC ERROR: Argument out of range >> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >> >> -gideon >> >>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>> >>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>> HI Barry, >>> >>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>> >>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>> >>>> >>>> Gideon, >>>> >>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>> >>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>> >>>> Barry >>>> >>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>> >>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>> >>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>> >>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>> >>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>> >>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>> >>>>> -gideon >>>>> >>>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> > -------------- next part -------------- An HTML attachment was scrubbed... 
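While debugging the preallocation failure above, one stopgap (the one the error text itself suggests) is to let the composite DM build the Jacobian and relax the new-nonzero checks, so missing coupling entries only cost performance instead of aborting. This is only a sketch; snes, user and FormJacobian stand in for the objects in the code shown above:

Mat J;
ierr = DMCreateMatrix(user.packer, &J);CHKERRQ(ierr);                        /* preallocated from the composite DM */
ierr = MatSetOption(J, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
ierr = MatSetOption(J, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
ierr = SNESSetJacobian(snes, J, J, FormJacobian, &user);CHKERRQ(ierr);       /* FormJacobian is the user's routine */

The longer-term fix, if the scalar parameters really couple into every field equation, is to tell the composite DM about those off-block couplings (see DMCompositeSetCoupling()) or to assemble a properly preallocated matrix by hand.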
URL: From bsmith at mcs.anl.gov Thu Aug 27 22:09:22 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 22:09:22 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> Message-ID: <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Can you send the code, that will be the easiest way to find the problem. My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. Barry > On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: > > I have it set up as: > > DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > DMCompositeAddDM(user.packer,user.p_dm); > DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > nx, 4, 1, NULL, &user.Q_dm); > DMCompositeAddDM(user.packer,user.Q_dm); > DMCreateGlobalVector(user.packer,&U); > > where the user.packer structure has > > DM packer; > DM p_dm, Q_dm; > > Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > > Here are some of the errors that are generated: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > > > > -gideon > >> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >> >> >> We need the full error message. >> >> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. >> >> Barry >> >> Though you should not get this error even if you are using a DMDA there. 
>> >>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>> >>> I?m getting the following errors: >>> >>> [1]PETSC ERROR: Argument out of range >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>> >>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>> >>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>> HI Barry, >>>> >>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>> >>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>> >>>>> >>>>> Gideon, >>>>> >>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>> >>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>> >>>>> Barry >>>>> >>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>> >>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>> >>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>> >>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>> >>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>> >>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>> >>>>>> -gideon >>>>>> >>>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>> >> > From gideon.simpson at gmail.com Thu Aug 27 22:15:50 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 23:15:50 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Message-ID: That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. Here is my form function. I can send more code if needed. /* Form the system of equations for computing a blowup solution*/ PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ blowup_ctx *user = (blowup_ctx *) ctx; PetscInt i; PetscScalar dx, dx2, xmax,x; PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; DMDALocalInfo info; Vec p_vec, Q_vec, Fp_vec, FQ_vec; PetscScalar *p_array, *Fp_array; Q *Qvals, *FQvals; PetscScalar Q2sig, W2sig; PetscScalar a,a2, b, u0, sigma; dx = user->dx; dx2 = dx *dx; xmax = user->xmax; sigma = user->sigma; /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ /* Extract raw arrays */ DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); DMCompositeScatter(user->packer, U, p_vec, Q_vec); /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ VecGetArray(p_vec,&p_array); VecGetArray(Fp_vec,&Fp_array); DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); DMDAGetLocalInfo(user->Q_dm, &info); a = p_array[0]; a2 = a*a; b = p_array[1]; u0 = p_array[2]; /* Set boundary conditions at the origin*/ if(info.xs ==0){ set_origin_bcs(u0, Qvals); } /* Set boundray conditions in the far field */ if(info.xs+ info.xm == info.mx){ set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); } /* Solve auxiliary equations */ if(info.xs ==0){ uxx = (2 * Qvals[0].u-2 * u0)/dx2; vxx = (Qvals[0].v + Qvals[0].g)/dx2; vx = (Qvals[0].v - Qvals[0].g)/(2*dx); Fp_array[0] = Qvals[0].u - Qvals[0].f; Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; Fp_array[2] = -uxx + (1/a2) * u0 + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); } /* Solve equations in the bulk */ for(i=info.xs; i < info.xs + info.xm;i++){ u = Qvals[i].u; v = Qvals[i].v; f = Qvals[i].f; g = Qvals[i].g; x = (i+1) * dx; Q2sig = PetscPowScalar(u*u + v*v,sigma); W2sig= PetscPowScalar(f*f + g*g, sigma); ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); FQvals[i].u = -uxx +1/a2 * u + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); FQvals[i].v = -vxx +1/a2 * v - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); FQvals[i].f = -fxx +1/a2 * f + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); FQvals[i].g =-gxx +1/a2 * g - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); } /* Restore raw arrays */ VecRestoreArray(p_vec, &p_array); 
VecRestoreArray(Fp_vec, &Fp_array); DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); return 0; } Here is the form function: -gideon > On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: > > > Can you send the code, that will be the easiest way to find the problem. > > My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > > Barry > >> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >> >> I have it set up as: >> >> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >> DMCompositeAddDM(user.packer,user.p_dm); >> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >> nx, 4, 1, NULL, &user.Q_dm); >> DMCompositeAddDM(user.packer,user.Q_dm); >> DMCreateGlobalVector(user.packer,&U); >> >> where the user.packer structure has >> >> DM packer; >> DM p_dm, Q_dm; >> >> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >> >> Here are some of the errors that are generated: >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Argument out of range >> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >> >> >> >> -gideon >> >>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>> >>> >>> We need the full error message. >>> >>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. 
>>> >>> Barry >>> >>> Though you should not get this error even if you are using a DMDA there. >>> >>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>> >>>> I?m getting the following errors: >>>> >>>> [1]PETSC ERROR: Argument out of range >>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>> >>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>> >>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>> HI Barry, >>>>> >>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>> >>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>> >>>>> Matt >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>> >>>>>> >>>>>> Gideon, >>>>>> >>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>> >>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>>> >>>>>> Barry >>>>>> >>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>> >>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>> >>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>> >>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>> >>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>> >>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>> >>>>>>> -gideon >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
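As a concrete reading of the refine-and-interpolate recipe Barry gives above: in petsc-3.5 the interpolation routine is spelled DMCreateInterpolation() rather than DMGetInterpolation(), the variable names below are made up, and whether the composite DM supplies the interpolation you want here is worth verifying.

    DM  packer_fine;
    Mat Interp;
    Vec U_fine, scale;
    DMRefine(user.packer, PETSC_COMM_WORLD, &packer_fine);
    DMCreateGlobalVector(packer_fine, &U_fine);
    DMCreateInterpolation(user.packer, packer_fine, &Interp, &scale);
    MatInterpolate(Interp, U, U_fine);   /* the DMRedundant entries are simply copied across */
    /* point the SNES at packer_fine and solve again using U_fine as the initial guess */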
>>>>> -- Norbert Wiener >>>> >>> >> > From bsmith at mcs.anl.gov Thu Aug 27 22:23:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 22:23:31 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Message-ID: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. Nothing ever gets coarsened in grid sequencing, it only gets refined. The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). > > Here is my form function. I can send more code if needed. > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > /* Form the system of equations for computing a blowup solution*/ > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > blowup_ctx *user = (blowup_ctx *) ctx; > PetscInt i; > PetscScalar dx, dx2, xmax,x; > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > DMDALocalInfo info; > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > PetscScalar *p_array, *Fp_array; > Q *Qvals, *FQvals; > PetscScalar Q2sig, W2sig; > PetscScalar a,a2, b, u0, sigma; > > dx = user->dx; dx2 = dx *dx; > xmax = user->xmax; > sigma = user->sigma; > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > /* Extract raw arrays */ > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > VecGetArray(p_vec,&p_array); > VecGetArray(Fp_vec,&Fp_array); > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > DMDAGetLocalInfo(user->Q_dm, &info); > > a = p_array[0]; a2 = a*a; > b = p_array[1]; > u0 = p_array[2]; > > /* Set boundary conditions at the origin*/ > if(info.xs ==0){ > set_origin_bcs(u0, Qvals); > } > /* Set boundray conditions in the far field */ > if(info.xs+ info.xm == info.mx){ > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); > } > > /* Solve auxiliary equations */ > if(info.xs ==0){ > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > Fp_array[0] = Qvals[0].u - Qvals[0].f; > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > Fp_array[2] = -uxx + (1/a2) * u0 > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > } > > /* Solve equations in the bulk */ > for(i=info.xs; i < info.xs + info.xm;i++){ > > u = Qvals[i].u; > v = Qvals[i].v; > f = Qvals[i].f; > g = Qvals[i].g; > > x = (i+1) * dx; > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > W2sig= PetscPowScalar(f*f + g*g, 
sigma); > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > FQvals[i].u = -uxx +1/a2 * u > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > FQvals[i].v = -vxx +1/a2 * v > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > FQvals[i].f = -fxx +1/a2 * f > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > FQvals[i].g =-gxx +1/a2 * g > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > } > > /* Restore raw arrays */ > VecRestoreArray(p_vec, &p_array); > VecRestoreArray(Fp_vec, &Fp_array); > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > return 0; > } > > > Here is the form function: > > > > -gideon > >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >> >> >> Can you send the code, that will be the easiest way to find the problem. >> >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >> >> Barry >> >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >>> >>> I have it set up as: >>> >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >>> DMCompositeAddDM(user.packer,user.p_dm); >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >>> nx, 4, 1, NULL, &user.Q_dm); >>> DMCompositeAddDM(user.packer,user.Q_dm); >>> DMCreateGlobalVector(user.packer,&U); >>> >>> where the user.packer structure has >>> >>> DM packer; >>> DM p_dm, Q_dm; >>> >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >>> >>> Here are some of the errors that are generated: >>> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [1]PETSC ERROR: Argument out of range >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >>> >>> >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>>> >>>> >>>> We need the full error message. >>>> >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. 
>>>> >>>> Barry >>>> >>>> Though you should not get this error even if you are using a DMDA there. >>>> >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>>> >>>>> I?m getting the following errors: >>>>> >>>>> [1]PETSC ERROR: Argument out of range >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>>> >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>>> >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>>> HI Barry, >>>>>> >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>>> >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>>> >>>>>> Matt >>>>>> >>>>>> -gideon >>>>>> >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> Gideon, >>>>>>> >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>>> >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>>> >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>>> >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>>> >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>>> >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>>> >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>>> >>>>>>>> -gideon >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
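A minimal sketch of the change Barry describes in this message: take the DM from the SNES at the top of the residual routine instead of using the one cached in the context. The sub-DM order is assumed to match the DMCompositeAddDM() calls quoted earlier in the thread (redundant scalars first, then the DMDA), and dx would likewise need to be recomputed from the refined DMDA rather than read from the context.

    PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx)
    {
      blowup_ctx    *user = (blowup_ctx *) ctx;
      DM             packer, p_dm, Q_dm;
      DMDALocalInfo  info;

      SNESGetDM(snes, &packer);                     /* current, possibly refined, composite DM */
      DMCompositeGetEntries(packer, &p_dm, &Q_dm);  /* order as added: DMRedundant, then DMDA */
      DMDAGetLocalInfo(Q_dm, &info);
      /* ... use packer, p_dm, Q_dm (and a dx recomputed from info.mx and user->xmax)
         everywhere the version above used user->packer, user->Q_dm and user->dx ... */
      return 0;
    }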
>>>>>> -- Norbert Wiener >>>>> >>>> >>> >> > From gideon.simpson at gmail.com Thu Aug 27 22:56:05 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 23:56:05 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: Ok, it seems to work with that switch, however, when I try to use DM sequence, I get errors like: 0 SNES Function norm 5.067205249874e-03 1 SNES Function norm 7.983917252341e-08 2 SNES Function norm 7.291012540201e-11 0 SNES Function norm 2.228951406196e+02 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (0,3) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 23:53:19 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() line 487 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (1,4) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 23:53:19 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #3 MatSetValues_SeqAIJ() line 487 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: #4 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c And in my main program, I did set MatSetOption(J, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); for the Jacobian. -gideon > On Aug 27, 2015, at 11:23 PM, Barry Smith wrote: > > >> On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: >> >> That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). >> >> Here is my form function. I can send more code if needed. 
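One guess at why the failure only reappears after the refinement step: with grid sequencing the solver presumably builds a fresh Jacobian from the refined DM, so a MatSetOption() call made once on the original J in main would not carry over. A hedged sketch (this is a guess, not a confirmed diagnosis) of re-applying the option inside the Jacobian routine, called form_jacobian here purely for illustration, on whatever matrix the solver passes in:

    PetscErrorCode form_jacobian(SNES snes, Vec U, Mat J, Mat Jpre, void *ctx)
    {
      /* allow entries outside the preallocation on every (re)created Jacobian */
      MatSetOption(Jpre, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);
      /* ... fill the entries as before ... */
      MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);
      return 0;
    }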
>> > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > >> /* Form the system of equations for computing a blowup solution*/ >> PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ >> >> blowup_ctx *user = (blowup_ctx *) ctx; >> PetscInt i; >> PetscScalar dx, dx2, xmax,x; >> PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; >> DMDALocalInfo info; >> Vec p_vec, Q_vec, Fp_vec, FQ_vec; >> PetscScalar *p_array, *Fp_array; >> Q *Qvals, *FQvals; >> PetscScalar Q2sig, W2sig; >> PetscScalar a,a2, b, u0, sigma; >> >> dx = user->dx; dx2 = dx *dx; >> xmax = user->xmax; >> sigma = user->sigma; >> >> /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ >> >> /* Extract raw arrays */ >> DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); >> DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> >> DMCompositeScatter(user->packer, U, p_vec, Q_vec); >> /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ >> >> VecGetArray(p_vec,&p_array); >> VecGetArray(Fp_vec,&Fp_array); >> >> DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); >> DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); >> >> DMDAGetLocalInfo(user->Q_dm, &info); >> >> a = p_array[0]; a2 = a*a; >> b = p_array[1]; >> u0 = p_array[2]; >> >> /* Set boundary conditions at the origin*/ >> if(info.xs ==0){ >> set_origin_bcs(u0, Qvals); >> } >> /* Set boundray conditions in the far field */ >> if(info.xs+ info.xm == info.mx){ >> set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); >> } >> >> /* Solve auxiliary equations */ >> if(info.xs ==0){ >> uxx = (2 * Qvals[0].u-2 * u0)/dx2; >> vxx = (Qvals[0].v + Qvals[0].g)/dx2; >> vx = (Qvals[0].v - Qvals[0].g)/(2*dx); >> Fp_array[0] = Qvals[0].u - Qvals[0].f; >> Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; >> Fp_array[2] = -uxx + (1/a2) * u0 >> + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); >> } >> >> /* Solve equations in the bulk */ >> for(i=info.xs; i < info.xs + info.xm;i++){ >> >> u = Qvals[i].u; >> v = Qvals[i].v; >> f = Qvals[i].f; >> g = Qvals[i].g; >> >> x = (i+1) * dx; >> >> Q2sig = PetscPowScalar(u*u + v*v,sigma); >> W2sig= PetscPowScalar(f*f + g*g, sigma); >> >> ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); >> vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); >> fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); >> gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); >> >> uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); >> vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); >> fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); >> gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); >> >> FQvals[i].u = -uxx +1/a2 * u >> + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); >> >> FQvals[i].v = -vxx +1/a2 * v >> - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); >> >> FQvals[i].f = -fxx +1/a2 * f >> + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); >> >> FQvals[i].g =-gxx +1/a2 * g >> - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); >> } >> >> /* Restore raw arrays */ >> VecRestoreArray(p_vec, &p_array); >> VecRestoreArray(Fp_vec, &Fp_array); >> >> DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); >> DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); >> >> DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); >> DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); >> DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> >> return 0; >> } >> >> >> Here is the form function: >> >> >> >> -gideon >> >>> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >>> >>> >>> Can you send the code, that will be the easiest way to find the problem. 
>>> >>> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >>> >>> Barry >>> >>>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >>>> >>>> I have it set up as: >>>> >>>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >>>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >>>> DMCompositeAddDM(user.packer,user.p_dm); >>>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >>>> nx, 4, 1, NULL, &user.Q_dm); >>>> DMCompositeAddDM(user.packer,user.Q_dm); >>>> DMCreateGlobalVector(user.packer,&U); >>>> >>>> where the user.packer structure has >>>> >>>> DM packer; >>>> DM p_dm, Q_dm; >>>> >>>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >>>> >>>> Here are some of the errors that are generated: >>>> >>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >>>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>> [1]PETSC ERROR: Argument out of range >>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >>>> >>>> >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>>>> >>>>> >>>>> We need the full error message. >>>>> >>>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. >>>>> >>>>> Barry >>>>> >>>>> Though you should not get this error even if you are using a DMDA there. >>>>> >>>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>>>> >>>>>> I?m getting the following errors: >>>>>> >>>>>> [1]PETSC ERROR: Argument out of range >>>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>>>> >>>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>>>> >>>>>> -gideon >>>>>> >>>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>>>> >>>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>>>> HI Barry, >>>>>>> >>>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>>>> >>>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> -gideon >>>>>>> >>>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>>>> >>>>>>>> >>>>>>>> Gideon, >>>>>>>> >>>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>>>> >>>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. 
Once you have the grid sequencing working we can help you with this. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>>>> >>>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>>>> >>>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>>>> >>>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>>>> >>>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>>>> >>>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>>>> >>>>>>>>> -gideon >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Thu Aug 27 23:32:09 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 28 Aug 2015 12:32:09 +0800 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: References: <55DEA8C7.5010100@gmail.com> Message-ID: <55DFE449.3020401@gmail.com> On 27/8/2015 11:38 PM, Satish Balay wrote: > Are you able to compile and run both C and fortran petsc examples > using the corresponding makefile? 
> > Satish Hi Satish, Yes, there is no problem except for a minor warning: /*$ make ex2*//* *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include -I/cygdrive/c*//* *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include -I/cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* *//*x2.c*//* *//*ex2.c*//* *//*You are using an Intel supplied intrinsic header file with a third-party compiler.*//* *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT -wd4996 -Z7 -o ex2 ex2.o -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* *//*pi_vs2008/lib -lpetsc -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib User32.lib Adva*//* *//*pi32.lib Kernel32.lib Ws2_32.lib*//* *//*/usr/bin/rm -f ex2.o*//* *//* *//*tsltaywb at 1C3YYY1 /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* *//*$ ./ex2f*//* *//*Norm of error 0.1192E-05 iterations 4*//* *//* *//*tsltaywb at 1C3YYY1 /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* *//*$ mpiexec -n 2 ./ex2f*//* *//*Norm of error < 1.e-12,iterations 7*/ > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > >> Hi, >> >> I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. >> >> Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). >> Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. >> >> However, I now have problem linking the files on VS2008 to create the final >> exe. The error is: >> >> /*1>Compiling manifest to resources...*//* >> *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* >> *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* >> *//*1>Linking...*//* >> *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ >> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >> *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS >> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >> *//*...*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> VECRESTOREARRAY referenced in function >> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> DMLOCALTOLOCALBEGIN referenced in function >> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >> *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol >> PETSCINITIALIZE referenced in function MAIN__*//* >> *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error >> LNK1120: 74 unresolved externals*//* >> *//*1>*//* >> *//*1>Build log written to >> "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* >> *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* >> *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ >> >> I did not do much changes since the prev PETSc worked. I only changed the >> directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment >> variables. I wonder what's wrong. >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 27 23:46:48 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Aug 2015 23:46:48 -0500 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: <55DFE449.3020401@gmail.com> References: <55DEA8C7.5010100@gmail.com> <55DFE449.3020401@gmail.com> Message-ID: I don't see a compile of ex2f in the copy/paste. Assuming that ran correctly and [ex2f was not an old binary lying arround] - it implies that your project file has bugs. Perhaps there is a verbose mode that it provides that you can use to see the exact compile command its using. I suspect its missing the equivalent of the following options: -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/lib -lpetsc Satish On Thu, 27 Aug 2015, TAY wee-beng wrote: > > On 27/8/2015 11:38 PM, Satish Balay wrote: > > Are you able to compile and run both C and fortran petsc examples > > using the corresponding makefile? 
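If it helps to separate project-configuration problems from source problems, a standalone link check along the lines of what Satish suggests above; nothing in it is specific to this thread beyond the PETSc include and library paths that the working make run already demonstrates.

    /* link_check.c: if this builds and links with the same include and
       library settings as the failing project, the LNK2019 errors are a
       missing-library/dependency issue rather than a source issue */
    #include <petsc.h>
    int main(int argc, char **argv)
    {
      PetscInitialize(&argc, &argv, NULL, NULL);
      PetscPrintf(PETSC_COMM_WORLD, "PETSc linked and initialized\n");
      PetscFinalize();
      return 0;
    }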
> > > > Satish > Hi Satish, > > Yes, there is no problem except for a minor warning: > > /*$ make ex2*//* > *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o > -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include > -I/cygdrive/c*//* > *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include > -I/cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* > *//*x2.c*//* > *//*ex2.c*//* > *//*You are using an Intel supplied intrinsic header file with a third-party > compiler.*//* > *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT > -wd4996 -Z7 -o ex2 ex2.o > -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* > *//*pi_vs2008/lib -lpetsc > -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas > /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* > *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib > /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* > *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib > User32.lib Adva*//* > *//*pi32.lib Kernel32.lib Ws2_32.lib*//* > *//*/usr/bin/rm -f ex2.o*//* > *//* > *//*tsltaywb at 1C3YYY1 > /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* > *//*$ ./ex2f*//* > *//*Norm of error 0.1192E-05 iterations 4*//* > *//* > *//*tsltaywb at 1C3YYY1 > /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* > *//*$ mpiexec -n 2 ./ex2f*//* > *//*Norm of error < 1.e-12,iterations 7*/ > > > > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > > > > > Hi, > > > > > > I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. > > > > > > Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). > > > Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. > > > > > > However, I now have problem linking the files on VS2008 to create the > > > final > > > exe. The error is: > > > > > > /*1>Compiling manifest to resources...*//* > > > *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* > > > *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* > > > *//*1>Linking...*//* > > > *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ > > > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > > > *//*1>global.obj : error LNK2019: unresolved external symbol > > > MATSETFROMOPTIONS > > > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > > > *//*...*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > VECGETARRAY referenced in function > > > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > VECRESTOREARRAY referenced in function > > > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > DMLOCALTOLOCALBEGIN referenced in function > > > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > DMLOCALTOLOCALEND referenced in function > > > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > > > *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol > > > PETSCINITIALIZE referenced in function MAIN__*//* > > > *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error > > > LNK1120: 74 unresolved externals*//* > > > *//*1>*//* > > > *//*1>Build log written to > > > "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* > > > *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* > > > *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ > > > > > > I did not do much changes since the prev PETSc worked. I only changed the > > > directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 > > > environment > > > variables. I wonder what's wrong. > > > > > > > > From zonexo at gmail.com Fri Aug 28 03:20:22 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 28 Aug 2015 16:20:22 +0800 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: References: <55DEA8C7.5010100@gmail.com> <55DFE449.3020401@gmail.com> Message-ID: <55E019C6.5030003@gmail.com> Hi Satish, Was searching high and low at the wrong place! I accidentally removed the libpetsc.lib from the library files... Now it worked. Thank you Yours sincerely, TAY wee-beng On 28/8/2015 12:46 PM, Satish Balay wrote: > I don't see a compile of ex2f in the copy/paste. Assuming that ran > correctly and [ex2f was not an old binary lying arround] - it implies > that your project file has bugs. > > Perhaps there is a verbose mode that it provides that you can use to > see the exact compile command its using. > > I suspect its missing the equivalent of the following options: > > -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/lib -lpetsc > > Satish > > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > >> On 27/8/2015 11:38 PM, Satish Balay wrote: >>> Are you able to compile and run both C and fortran petsc examples >>> using the corresponding makefile? 
>>> >>> Satish >> Hi Satish, >> >> Yes, there is no problem except for a minor warning: >> >> /*$ make ex2*//* >> *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o >> -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include >> -I/cygdrive/c*//* >> *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include >> -I/cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* >> *//*x2.c*//* >> *//*ex2.c*//* >> *//*You are using an Intel supplied intrinsic header file with a third-party >> compiler.*//* >> *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT >> -wd4996 -Z7 -o ex2 ex2.o >> -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* >> *//*pi_vs2008/lib -lpetsc >> -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas >> /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* >> *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib >> /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* >> *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib >> User32.lib Adva*//* >> *//*pi32.lib Kernel32.lib Ws2_32.lib*//* >> *//*/usr/bin/rm -f ex2.o*//* >> *//* >> *//*tsltaywb at 1C3YYY1 >> /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* >> *//*$ ./ex2f*//* >> *//*Norm of error 0.1192E-05 iterations 4*//* >> *//* >> *//*tsltaywb at 1C3YYY1 >> /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* >> *//*$ mpiexec -n 2 ./ex2f*//* >> *//*Norm of error < 1.e-12,iterations 7*/ >> >>> On Thu, 27 Aug 2015, TAY wee-beng wrote: >>> >>>> Hi, >>>> >>>> I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. >>>> >>>> Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). >>>> Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. >>>> >>>> However, I now have problem linking the files on VS2008 to create the >>>> final >>>> exe. The error is: >>>> >>>> /*1>Compiling manifest to resources...*//* >>>> *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* >>>> *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* >>>> *//*1>Linking...*//* >>>> *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ >>>> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >>>> *//*1>global.obj : error LNK2019: unresolved external symbol >>>> MATSETFROMOPTIONS >>>> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >>>> *//*...*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> VECGETARRAY referenced in function >>>> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> VECRESTOREARRAY referenced in function >>>> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> DMLOCALTOLOCALBEGIN referenced in function >>>> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> DMLOCALTOLOCALEND referenced in function >>>> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >>>> *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol >>>> PETSCINITIALIZE referenced in function MAIN__*//* >>>> *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error >>>> LNK1120: 74 unresolved externals*//* >>>> *//*1>*//* >>>> *//*1>Build log written to >>>> "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* >>>> *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* >>>> *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ >>>> >>>> I did not do much changes since the prev PETSc worked. I only changed the >>>> directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 >>>> environment >>>> variables. I wonder what's wrong. >>>> >>>> >> From knepley at gmail.com Fri Aug 28 05:20:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 28 Aug 2015 05:20:02 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson > wrote: > > > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep > in mind that I?m trying to solve, simultaneously, for the scalar parameters > and the vector field. I guess what I am unclear about is how DMRefine is > to know that the unknown associated with the scalar parameters can never be > coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars > are created with DMRedundant and the DMRedundant object knows that > refinement means "leave as is, since there is no grid" while the DMDA knows > it is a grid and knows how to refine itself. So when it "interpolates" the > DMRedundant variables it just copies them (or multiples them by the matrix > 1 which is just a copy). > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he uses those variable in the coupled diffusion equations he did write down. 
He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem and the diffusion equations. He uses DMComposite to deal with all the unknowns. I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. Thanks, Matt > > > > Here is my form function. I can send more code if needed. > > > > Just change the user->packer that you use to be the DM obtained with > SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); > */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > 
DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian > computation the use of the original DM for computations instead of using > the current DM (with refinement there will be a new DM on the second level > different than your original DM). So what you need to do in writing your > FormFunction and FormJacobian is to call SNESGetDM() to get the current DM > and then use DMComputeGet... to access the individual DMDA and DMRedundent > for the parts. I notice you have this user.Q_dm I bet inside your form > functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the > nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to > turn off this check > >>> [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by > gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local > --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries > --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local > --with-superlu-dir=/opt/local --with-metis-dir=/opt/local > --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local > --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local > CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp > FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp > F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os > FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" > CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os > FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports > --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by > gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local > --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries > --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local > --with-superlu-dir=/opt/local --with-metis-dir=/opt/local > --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local > --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local > CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp > FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp > F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os > FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" > CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os > FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports > --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: > >>>> > >>>> > >>>> We need the full error message. 
> >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you > should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA > there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da > holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot > of sense, to solve on a spatially coarse mesh for the field variables, > interpolate onto the finer mesh, and then solve again. I?m not entirely > clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . > If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, > interpolate u1 and u2 to a once refined version of the grid and use that > plus the mu lam as initial guess for the next level. Repeat to as fine a > grid as you want. You can use DMRefine() and DMGetInterpolation() to get > the interpolation needed to interpolate from the coarse to finer mesh. > >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without > fieldsplit) to solve the linear problems for finer meshes. Once you have > the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system > of coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, > one for the parameters (lam, mu), and one for the vector field (u_1, u_2) > on the mesh. I have had success in solving this as a fully coupled system > with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine > enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the > function norm = O(10^{-4}), eventually returning reason -6 (failed line > search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one > thing I was thinking of trying would be to get away from direct solvers, > and I was hoping to use field split for this. However, it?s a bit beyond > what I?ve seen examples for because it has 2 types of variables: scalar > parameters which appear globally in the system and vector valued field > variables. Any suggestions on how to get started? 
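A rough sketch of one way to get started with the fieldsplit part of that question, not taken from the thread itself: hand PCFieldSplit the index sets that the DMComposite already knows about. This assumes the layout quoted elsewhere in the thread (a DMRedundant p_dm for the scalar parameters and a DMDA Q_dm packed into user.packer, attached to the SNES), one assembled Jacobian, and made-up split names "params" and "fields"; error checking is omitted as in the rest of the thread's code.

  KSP ksp;
  PC  pc;
  IS  *is_split;   /* one IS per sub-DM, in the order they were added to the composite */

  SNESGetKSP(snes, &ksp);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCFIELDSPLIT);

  DMCompositeGetGlobalISs(user.packer, &is_split);
  PCFieldSplitSetIS(pc, "params", is_split[0]);   /* the DMRedundant scalars */
  PCFieldSplitSetIS(pc, "fields", is_split[1]);   /* the DMDA field variables */

  ISDestroy(&is_split[0]);
  ISDestroy(&is_split[1]);
  PetscFree(is_split);

The "params" split is only a handful of unknowns, so something like -fieldsplit_params_pc_type lu (or a Schur-complement arrangement of the two splits) would be the natural thing to try on it.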
> >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Fri Aug 28 12:21:36 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 13:21:36 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: <3BD30839-963B-427A-B65A-F20D794606B9@gmail.com> Yes, to clarify, the problem with two scalar fields and two scalar ?eigenvalues? is -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx (u_1, u_2) are defined on the mesh, and (lam, mu) are unknown scalars. My actual problem has a 4 degrees of freedom at each mesh point and 3 unknown scalars, but this makes the point. -gideon > On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith > wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson > wrote: > > > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). > > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he > uses those variable in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem > and the diffusion equations. He uses DMComposite to deal with all the unknowns. > > I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. > > Thanks, > > Matt > > > > > Here is my form function. I can send more code if needed. 
> > > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx ){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx ); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 
PM, Barry Smith > wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith > wrote: > >>>> > >>>> > >>>> We need the full error message. > >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson > wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. 
> >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Fri Aug 28 14:41:26 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 15:41:26 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> Hi Barry, Matt, Barry, your solution worked, but I wanted to follow up on a few points on grid sequencing 1. If, in using the grid sequence, there is no preallocation of the matrix, does that mean I?m going to get hit (badly) with the mallocing as I increase the number of levels? 2. 
In my problem, with real data, I see behavior like:

0 SNES Function norm 7.948742655505e-03
    0 KSP Residual norm 3.666593515373e-03
    1 KSP Residual norm 7.943650614441e-16
1 SNES Function norm 9.001557371893e-07
    0 KSP Residual norm 8.814810163693e-06
    1 KSP Residual norm 6.638031123907e-18
2 SNES Function norm 4.176927119066e-11
0 SNES Function norm 1.500187158175e+02
    0 KSP Residual norm 1.006776821797e-01
    1 KSP Residual norm 2.010368372645e-13
1 SNES Function norm 5.899853203939e-03
    0 KSP Residual norm 1.752660743738e-02
    1 KSP Residual norm 1.244868008219e-14
2 SNES Function norm 1.748583606371e-06
    0 KSP Residual norm 4.933624839470e-06
    1 KSP Residual norm 5.789658241868e-18
3 SNES Function norm 2.034638891687e-10

Where, when it gets to the first refinement, it's not clear how much advantage it's taking of the coarser solution.

3. This problem is actually part of a continuation problem that roughly looks like this

for( continuation parameter p = 0 to 1){
    solve with parameter p_i using solution from p_{i-1},
}

What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh.

4. When I do SNESSolve(snes, NULL, U) with grid sequencing, U is not the solution on the fine mesh. But what is it? Is it still the starting guess, or is it the solution on the coarse mesh?

-gideon

> On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: > > > > That's correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I'm trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiplies them by the matrix 1, which is just a copy). > > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he > uses those variables in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem > and the diffusion equations. He uses DMComposite to deal with all the unknowns. > > I think Nonlinear Block Gauss-Seidel on the different problems would be a useful starting point, but we do not have that. > > Thanks, > > Matt > > > > > Here is my form function. I can send more code if needed.
> > > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx ){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx ); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 
PM, Barry Smith > wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith > wrote: > >>>> > >>>> > >>>> We need the full error message. > >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson > wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. 
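For reference, a minimal sketch of that refine-and-interpolate step; DMGetInterpolation() is spelled DMCreateInterpolation() in the 3.5/3.6 releases, and the names dm_coarse, Ucoarse, Ufine and the single refinement level are just placeholders, not code from the thread.

  DM  dm_fine;
  Mat interp;
  Vec Ufine;

  DMRefine(dm_coarse, PetscObjectComm((PetscObject)dm_coarse), &dm_fine);
  DMCreateGlobalVector(dm_fine, &Ufine);
  DMCreateInterpolation(dm_coarse, dm_fine, &interp, NULL);
  MatInterpolate(interp, Ucoarse, Ufine);   /* coarse solution -> initial guess on the fine grid */
  MatDestroy(&interp);
  /* solve again with dm_fine and Ufine; destroy dm_fine and Ufine when done */

With the DMComposite used here, refinement is expected to refine the DMDA part and carry the DMRedundant scalars over unchanged, which is what the reply goes on to describe.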
> >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 14:55:18 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 14:55:18 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> Message-ID: <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> > On Aug 28, 2015, at 2:41 PM, Gideon Simpson wrote: > > Hi Barry, Matt, > > Barry, your solution worked, but I wanted to follow up on a few points on grid sequencing > > 1. If, in using the grid sequence, there is no preallocation of the matrix, does that mean I?m going to get hit (badly) with the mallocing as I increase the number of levels? Actually the preallocation of the matrix is the same with or without grid sequencing! It is doing the proper preallocation for the DMDA part of the matrix, it is the coupling between the DMDA variables and the DMREDUNDANT variables that is not preallocated. This may be a performance hit for large problems but lets cross that bridge when we get to it. > > > 2. 
In my problem, with real data, I see behavior like: > > 0 SNES Function norm 7.948742655505e-03 > 0 KSP Residual norm 3.666593515373e-03 > 1 KSP Residual norm 7.943650614441e-16 > 1 SNES Function norm 9.001557371893e-07 > 0 KSP Residual norm 8.814810163693e-06 > 1 KSP Residual norm 6.638031123907e-18 > 2 SNES Function norm 4.176927119066e-11 > 0 SNES Function norm 1.500187158175e+02 > 0 KSP Residual norm 1.006776821797e-01 > 1 KSP Residual norm 2.010368372645e-13 > 1 SNES Function norm 5.899853203939e-03 > 0 KSP Residual norm 1.752660743738e-02 > 1 KSP Residual norm 1.244868008219e-14 > 2 SNES Function norm 1.748583606371e-06 > 0 KSP Residual norm 4.933624839470e-06 > 1 KSP Residual norm 5.789658241868e-18 > 3 SNES Function norm 2.034638891687e-10 > > Where, when it gets to the first refinement, it?s not clear how much advantage its taking of the coarser solution. I see this often with grid sequencing. When you interpolate to the next mesh the residual norm is still pretty big so it looks like it does not help, but actually even though the residual norm is big the initial guess is still much better for Newton's method. So the question is not if the interpolated residual norm is big, instead the question is can you get convergence on finer and finer meshes that you could not get before. You'll need to run the refinement for several levels to see, I predict you will. > > > 3. This problem is actually part of a continuation problem that roughly looks like this > > for( continuation parameter p = 0 to 1){ > > solve with parameter p_i using solution from p_{i-1}, > } > > What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: Do not use -snes_grid_sequencing Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. Call SNESSetGridSequence() Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > > 4. When I do SNESSolve(snes, NULL, U) with grid sequencing, U is not the solution on the fine mesh. But what is it? Is it still the starting guess, or is it the solution on the coarse mesh? It will contain the solution on the coarse mesh. After SNESSolve() call SNESGetSolution() and it will give back the solution on the fine mesh. Barry > > > -gideon > >> On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: >> >> On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: >> >> > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: >> > >> > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. >> >> Nothing ever gets coarsened in grid sequencing, it only gets refined. 
>> >> The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). >> >> I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he >> uses those variable in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem >> and the diffusion equations. He uses DMComposite to deal with all the unknowns. >> >> I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. >> >> Thanks, >> >> Matt >> >> > >> > Here is my form function. I can send more code if needed. >> > >> >> Just change the user->packer that you use to be the DM obtained with SNESGetDM() >> >> >> > /* Form the system of equations for computing a blowup solution*/ >> > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ >> > >> > blowup_ctx *user = (blowup_ctx *) ctx; >> > PetscInt i; >> > PetscScalar dx, dx2, xmax,x; >> > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; >> > DMDALocalInfo info; >> > Vec p_vec, Q_vec, Fp_vec, FQ_vec; >> > PetscScalar *p_array, *Fp_array; >> > Q *Qvals, *FQvals; >> > PetscScalar Q2sig, W2sig; >> > PetscScalar a,a2, b, u0, sigma; >> > >> > dx = user->dx; dx2 = dx *dx; >> > xmax = user->xmax; >> > sigma = user->sigma; >> > >> > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ >> > >> > /* Extract raw arrays */ >> > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); >> > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> > >> > DMCompositeScatter(user->packer, U, p_vec, Q_vec); >> > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ >> > >> > VecGetArray(p_vec,&p_array); >> > VecGetArray(Fp_vec,&Fp_array); >> > >> > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); >> > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); >> > >> > DMDAGetLocalInfo(user->Q_dm, &info); >> > >> > a = p_array[0]; a2 = a*a; >> > b = p_array[1]; >> > u0 = p_array[2]; >> > >> > /* Set boundary conditions at the origin*/ >> > if(info.xs ==0){ >> > set_origin_bcs(u0, Qvals); >> > } >> > /* Set boundray conditions in the far field */ >> > if(info.xs+ info.xm == info.mx){ >> > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); >> > } >> > >> > /* Solve auxiliary equations */ >> > if(info.xs ==0){ >> > uxx = (2 * Qvals[0].u-2 * u0)/dx2; >> > vxx = (Qvals[0].v + Qvals[0].g)/dx2; >> > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); >> > Fp_array[0] = Qvals[0].u - Qvals[0].f; >> > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; >> > Fp_array[2] = -uxx + (1/a2) * u0 >> > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); >> > } >> > >> > /* Solve equations in the bulk */ >> > for(i=info.xs; i < info.xs + info.xm;i++){ >> > >> > u = Qvals[i].u; >> > v = Qvals[i].v; >> > f = Qvals[i].f; >> > g = Qvals[i].g; >> > >> > x = (i+1) * dx; >> > >> > Q2sig = PetscPowScalar(u*u + v*v,sigma); >> > W2sig= PetscPowScalar(f*f + g*g, sigma); >> > >> > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); >> > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); >> > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); >> > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); >> > >> > uxx = 
(Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); >> > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); >> > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); >> > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); >> > >> > FQvals[i].u = -uxx +1/a2 * u >> > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); >> > >> > FQvals[i].v = -vxx +1/a2 * v >> > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); >> > >> > FQvals[i].f = -fxx +1/a2 * f >> > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); >> > >> > FQvals[i].g =-gxx +1/a2 * g >> > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); >> > } >> > >> > /* Restore raw arrays */ >> > VecRestoreArray(p_vec, &p_array); >> > VecRestoreArray(Fp_vec, &Fp_array); >> > >> > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); >> > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); >> > >> > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); >> > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); >> > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> > >> > return 0; >> > } >> > >> > >> > Here is the form function: >> > >> > >> > >> > -gideon >> > >> >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >> >> >> >> >> >> Can you send the code, that will be the easiest way to find the problem. >> >> >> >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >> >> >> >> Barry >> >> >> >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >> >>> >> >>> I have it set up as: >> >>> >> >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >> >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >> >>> DMCompositeAddDM(user.packer,user.p_dm); >> >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >> >>> nx, 4, 1, NULL, &user.Q_dm); >> >>> DMCompositeAddDM(user.packer,user.Q_dm); >> >>> DMCreateGlobalVector(user.packer,&U); >> >>> >> >>> where the user.packer structure has >> >>> >> >>> DM packer; >> >>> DM p_dm, Q_dm; >> >>> >> >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >> >>> >> >>> Here are some of the errors that are generated: >> >>> >> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >>> [0]PETSC ERROR: Argument out of range >> >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >> >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >> >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >>> [1]PETSC ERROR: Argument out of range >> >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >> >>> >> >>> >> >>> >> >>> -gideon >> >>> >> >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >> >>>> >> >>>> >> >>>> We need the full error message. >> >>>> >> >>>> But are you using a DMDA for the scalars? 
You should not be, you should be using a DMRedundant for the scalars. >> >>>> >> >>>> Barry >> >>>> >> >>>> Though you should not get this error even if you are using a DMDA there. >> >>>> >> >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >> >>>>> >> >>>>> I'm getting the following errors: >> >>>>> >> >>>>> [1]PETSC ERROR: Argument out of range >> >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >>>>> >> >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >> >>>>> >> >>>>> -gideon >> >>>>> >> >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >> >>>>>> >> >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >> >>>>>> HI Barry, >> >>>>>> >> >>>>>> Nope, I'm not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I'm not entirely clear on the practical implementation >> >>>>>> >> >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> -gideon >> >>>>>> >> >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >> >>>>>>> >> >>>>>>> >> >>>>>>> Gideon, >> >>>>>>> >> >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >> >>>>>>> >> >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >> >>>>>>> >> >>>>>>> Barry >> >>>>>>> >> >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >> >>>>>>>> >> >>>>>>>> I'm working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >> >>>>>>>> >> >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >> >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >> >>>>>>>> >> >>>>>>>> Currently, I have it set up with a DMComposite with two sub da's, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >> >>>>>>>> >> >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >> >>>>>>>> >> >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it's a bit beyond what I've seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started?
>> >>>>>>>> >> >>>>>>>> -gideon >> >>>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>> >> >>>> >> >>> >> >> >> > >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener From gideon.simpson at gmail.com Fri Aug 28 15:04:47 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 16:04:47 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> Message-ID: <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Yes, if I continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back and refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation parameter is loaded on the coarse mesh, and refined. Perhaps that's the most practical thing to do. -gideon > On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: > >> >> >> 3. This problem is actually part of a continuation problem that roughly looks like this >> >> for( continuation parameter p = 0 to 1){ >> >> solve with parameter p_i using solution from p_{i-1}, >> } >> >> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. > > So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). > > If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: > > Do not use -snes_grid_sequencing > > Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. > > Call SNESSetGridSequence() > > Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc.
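[To make the recipe quoted just above concrete, here is a minimal, untested C sketch of the two-phase workflow: continuation entirely on the coarse mesh, then SNESSetGridSequence() and one more solve. The plain 1d DMDA, the placeholder FormFunction, and the values of nsteps/nlevels are illustrative assumptions only; the actual problem in this thread uses a DMComposite, and as Barry noted earlier the real form function must call SNESGetDM() so it works on the refined levels.]

#include <petscsnes.h>

typedef struct { PetscReal p; } AppCtx;              /* continuation parameter */

/* Placeholder residual F(X) = X; stands in for the real form function. */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  VecCopy(X, F);
  return 0;
}

int main(int argc, char **argv)
{
  SNES     snes;
  DM       da;
  Vec      U, Ufine;
  AppCtx   user;
  PetscInt i, nsteps = 10, nlevels = 3, nx = 100;    /* illustrative values */

  PetscInitialize(&argc, &argv, NULL, NULL);
  /* PETSc 3.5-style creation; newer releases also need DMSetFromOptions()/DMSetUp() here. */
  DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, nx, 4, 1, NULL, &da);
  DMCreateGlobalVector(da, &U);
  SNESCreate(PETSC_COMM_WORLD, &snes);
  SNESSetDM(snes, da);
  SNESSetFunction(snes, NULL, FormFunction, &user);
  SNESSetFromOptions(snes);

  /* Phase 1: continuation, every solve on the coarse mesh; the previous
     solution U is reused as the initial guess for the next parameter value. */
  for (i = 0; i < nsteps; i++) {
    user.p = (PetscReal)i / (PetscReal)(nsteps - 1);
    SNESSolve(snes, NULL, U);
  }

  /* Phase 2: turn on grid sequencing and solve once more; the coarse U is
     the starting point and the result lives on the finest refined DM. */
  SNESSetGridSequence(snes, nlevels);
  SNESSolve(snes, NULL, U);
  SNESGetSolution(snes, &Ufine);                     /* owned by the SNES; do not destroy */

  VecDestroy(&U);
  SNESDestroy(&snes);
  DMDestroy(&da);
  PetscFinalize();
  return 0;
}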
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 15:21:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 15:21:44 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: > On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: > > Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. > > One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. > > The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. I would do the following. Create your DM and create a SNES that will do the continuation loop over continuation parameter SNESSolve(snes,NULL,Ucoarse); if (you decide you want to see the refined solution at this continuation point) { SNESCreate(comm,&snesrefine); SNESSetDM() etc SNESSetGridSequence(snesrefine,) SNESSolve(snesrefine,0,Ucoarse); SNESGetSolution(snesrefine,&Ufine); VecView(Ufine or do whatever you want to do with the Ufine at that continuation point SNESDestroy(snesrefine); end if end loop over continuation parameter. Barry > > -gideon > >> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >> >>> >>> >>> 3. This problem is actually part of a continuation problem that roughly looks like this >>> >>> for( continuation parameter p = 0 to 1){ >>> >>> solve with parameter p_i using solution from p_{i-1}, >>> } >>> >>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >> >> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >> >> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >> >> Do not use -snes_grid_sequencing >> >> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. 
>> >> Call SNESSetGridSequence() >> >> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > From hanklammiv at gmail.com Fri Aug 28 16:13:56 2015 From: hanklammiv at gmail.com (Hank Lamm) Date: Fri, 28 Aug 2015 14:13:56 -0700 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. Message-ID: Hi All, I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: --------------------- Error Message -------------------------------------------------------------- [1]Total space allocated 1736835920 bytes [1]PETSC ERROR: Out of memory. This could be due to allocating [1]PETSC ERROR: too large an object or bleeding by not properly [1]PETSC ERROR: destroying unneeded objects. [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 The curious thing about this error, is that it seems that if I increase the number of nodes, 
from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. The relevant code piece is: void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) { EPS eps; /* eigenproblem solver context */ EPSType type; ST st; KSP ksp; PC pc; PetscReal tol,error; PetscReal lower,upper; //PetscInt nev=dim,maxit,its; PetscInt nev,maxit,its,nconv; Vec xr,xi; PetscScalar kr,ki; PetscReal re,im; PetscViewer viewer; PetscInt rank; PetscInt size; std::string eig_file_n; std::ofstream eig_file; char ofile[100]; MPI_Comm_rank(PETSC_COMM_WORLD,&rank); MPI_Comm_size(PETSC_COMM_WORLD,&size); ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); eig_file_n.append(params->ofile_n); eig_file_n.append("_eval"); eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); //Set operators. In this case, it is a standard eigenvalue problem ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); ierr = EPSGetST(eps,&st);CHKERRV(ierr); ierr = STSetType(st,STSINVERT);CHKERRV(ierr); ierr = STGetKSP(st,&ksp);CHKERRV(ierr); ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); for(PetscInt i=0;inf;i++){ lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); upper=4.0*params->m[i]*params->m[i]; ierr = EPSSetInterval(eps,lower,upper); ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); //Set solver parameters at runtime ierr = EPSSetFromOptions(eps);CHKERRV(ierr); // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); ierr = EPSSolve(eps);CHKERRV(ierr); ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); //Optional: Get some information from the solver and display it ierr = EPSGetType(eps,&type);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); strcpy(ofile,params->ofile_n); strcat(ofile,"_evecr"); ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); if (nconv>0) { ierr = PetscPrintf(PETSC_COMM_WORLD, " k ||Ax-kx||/||kx||\n" " ----------------- ------------------\n");CHKERRV(ierr); for (PetscInt i=0;i From gideon.simpson at gmail.com Fri Aug 28 16:35:35 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 17:35:35 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> 
<05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: Hi Barry, Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? -gideon > On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: > >> >> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >> >> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >> >> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >> >> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. > > I would do the following. Create your DM and create a SNES that will do the continuation > > loop over continuation parameter > > SNESSolve(snes,NULL,Ucoarse); > > if (you decide you want to see the refined solution at this continuation point) { > SNESCreate(comm,&snesrefine); > SNESSetDM() > etc > SNESSetGridSequence(snesrefine,) > SNESSolve(snesrefine,0,Ucoarse); > SNESGetSolution(snesrefine,&Ufine); > VecView(Ufine or do whatever you want to do with the Ufine at that continuation point > SNESDestroy(snesrefine); > end if > > end loop over continuation parameter. > > Barry > >> >> -gideon >> >>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>> >>>> >>>> >>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>> >>>> for( continuation parameter p = 0 to 1){ >>>> >>>> solve with parameter p_i using solution from p_{i-1}, >>>> } >>>> >>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>> >>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>> >>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>> >>> Do not use -snes_grid_sequencing >>> >>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>> >>> Call SNESSetGridSequence() >>> >>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Aug 28 17:03:01 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 17:03:01 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> > On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: > > Hi Barry, > > Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. Barry > > -gideon > >> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >> >>> >>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>> >>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>> >>> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>> >>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >> >> I would do the following. Create your DM and create a SNES that will do the continuation >> >> loop over continuation parameter >> >> SNESSolve(snes,NULL,Ucoarse); >> >> if (you decide you want to see the refined solution at this continuation point) { >> SNESCreate(comm,&snesrefine); >> SNESSetDM() >> etc >> SNESSetGridSequence(snesrefine,) >> SNESSolve(snesrefine,0,Ucoarse); >> SNESGetSolution(snesrefine,&Ufine); >> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >> SNESDestroy(snesrefine); >> end if >> >> end loop over continuation parameter. >> >> Barry >> >>> >>> -gideon >>> >>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>> >>>>> >>>>> >>>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>>> >>>>> for( continuation parameter p = 0 to 1){ >>>>> >>>>> solve with parameter p_i using solution from p_{i-1}, >>>>> } >>>>> >>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. 
>>>> >>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>> >>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>> >>>> Do not use -snes_grid_sequencing >>>> >>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>> >>>> Call SNESSetGridSequence() >>>> >>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > From bsmith at mcs.anl.gov Fri Aug 28 18:14:40 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 18:14:40 -0500 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: Message-ID: It is using a SeqAIJ matrix, not a parallel matrix. Increasing the number of cores won't affect the size of a sequential matrix since it must be stored entirely on one process. Perhaps you need to use parallel matrices? [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: > > Hi All, > > I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: > > --------------------- Error Message -------------------------------------------------------------- > [1]Total space allocated 1736835920 bytes > [1]PETSC ERROR: Out of memory. This could be due to allocating > [1]PETSC ERROR: too large an object or bleeding by not properly > [1]PETSC ERROR: destroying unneeded objects. > [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 > [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 > [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c > [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c > [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp > [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 > [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 > > > The curious thing about this error, is that it seems that if I increase the number of nodes, from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. > > The relevant code piece is: > > void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) > { > > > EPS eps; /* eigenproblem solver context */ > EPSType type; > ST st; > KSP ksp; > PC pc; > PetscReal tol,error; > PetscReal lower,upper; > //PetscInt nev=dim,maxit,its; > PetscInt nev,maxit,its,nconv; > Vec xr,xi; > PetscScalar kr,ki; > PetscReal re,im; > PetscViewer viewer; > PetscInt rank; > PetscInt size; > std::string eig_file_n; > std::ofstream eig_file; > char ofile[100]; > > MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > MPI_Comm_size(PETSC_COMM_WORLD,&size); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); > > eig_file_n.append(params->ofile_n); > eig_file_n.append("_eval"); > eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); > > //Set operators. 
In this case, it is a standard eigenvalue problem > ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); > > ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); > > ierr = EPSGetST(eps,&st);CHKERRV(ierr); > ierr = STSetType(st,STSINVERT);CHKERRV(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRV(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); > ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); > > for(PetscInt i=0;inf;i++){ > lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); > upper=4.0*params->m[i]*params->m[i]; > ierr = EPSSetInterval(eps,lower,upper); > ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); > //Set solver parameters at runtime > ierr = EPSSetFromOptions(eps);CHKERRV(ierr); > // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); > > ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); > ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); > > > ierr = EPSSolve(eps);CHKERRV(ierr); > > ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); > > > //Optional: Get some information from the solver and display it > ierr = EPSGetType(eps,&type);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); > ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); > ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); > > ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); > > strcpy(ofile,params->ofile_n); > strcat(ofile,"_evecr"); > > ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); > > if (nconv>0) > { > ierr = PetscPrintf(PETSC_COMM_WORLD, > " k ||Ax-kx||/||kx||\n" > " ----------------- ------------------\n");CHKERRV(ierr); > > for (PetscInt i=0;i { > //Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part) > ierr = EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); > //Compute the relative error associated to each eigenpair > ierr = EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); > > #if defined(PETSC_USE_COMPLEX) > re = PetscRealPart(kr); > im = PetscImaginaryPart(kr); > #else > re = kr; > im = ki; > #endif > > if (im!=0.0) > { > > ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j %12g\n",re,im,error);CHKERRV(ierr); > if(rank==0) eig_file << re << " " << im << " " << error << std::endl; > } else > { > ierr = PetscPrintf(PETSC_COMM_WORLD," %12f %12g\n",re,error);CHKERRV(ierr); > if(rank==0) eig_file << re << " " << 0 << " " << error << std::endl; > } > > ierr = VecView(xr,viewer);CHKERRV(ierr); > > } > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); > } > } > eig_file.close(); > ierr = EPSDestroy(&eps);CHKERRV(ierr); > ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); > ierr = VecDestroy(&xr);CHKERRV(ierr); > ierr = VecDestroy(&xi);CHKERRV(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue Solver---\n");CHKERRV(ierr); > } > > > > Thanks, > Hank From gideon.simpson at gmail.com Fri Aug 28 19:29:03 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: 
Fri, 28 Aug 2015 20:29:03 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> Message-ID: That was a mistake on my part. But I did want to ask, what should be the behavior with a grid sequence if the SNES fails during one of the intermediate steps? -gideon > On Aug 28, 2015, at 6:03 PM, Barry Smith wrote: > > >> On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: >> >> Hi Barry, >> >> Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? > > Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. > > Barry > > > >> >> -gideon >> >>> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >>> >>>> >>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>>> >>>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>>> >>>> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>>> >>>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >>> >>> I would do the following. Create your DM and create a SNES that will do the continuation >>> >>> loop over continuation parameter >>> >>> SNESSolve(snes,NULL,Ucoarse); >>> >>> if (you decide you want to see the refined solution at this continuation point) { >>> SNESCreate(comm,&snesrefine); >>> SNESSetDM() >>> etc >>> SNESSetGridSequence(snesrefine,) >>> SNESSolve(snesrefine,0,Ucoarse); >>> SNESGetSolution(snesrefine,&Ufine); >>> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >>> SNESDestroy(snesrefine); >>> end if >>> >>> end loop over continuation parameter. >>> >>> Barry >>> >>>> >>>> -gideon >>>> >>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>>> >>>>>> >>>>>> >>>>>> 3. 
This problem is actually part of a continuation problem that roughly looks like this >>>>>> >>>>>> for( continuation parameter p = 0 to 1){ >>>>>> >>>>>> solve with parameter p_i using solution from p_{i-1}, >>>>>> } >>>>>> >>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>>>> >>>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>>> >>>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>>> >>>>> Do not use -snes_grid_sequencing >>>>> >>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>>> >>>>> Call SNESSetGridSequence() >>>>> >>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 19:54:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 19:54:59 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> Message-ID: > On Aug 28, 2015, at 7:29 PM, Gideon Simpson wrote: > > That was a mistake on my part. But I did want to ask, what should be the behavior with a grid sequence if the SNES fails during one of the intermediate steps? You'll have to look at the code. So I just did, unless you set the option -snes_error_if_not_converged it will blinding go on. But the final SNESConvergedReason() will hopefully be negative indicating that SNES has not converged. Barry > > -gideon > >> On Aug 28, 2015, at 6:03 PM, Barry Smith wrote: >> >> >>> On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: >>> >>> Hi Barry, >>> >>> Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? >> >> Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. >> >> Barry >> >> >> >>> >>> -gideon >>> >>>> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >>>> >>>>> >>>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>>>> >>>>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>>>> >>>>> One subtlety is that I actually want the intermediate continuation solutions too. 
Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>>>> >>>>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >>>> >>>> I would do the following. Create your DM and create a SNES that will do the continuation >>>> >>>> loop over continuation parameter >>>> >>>> SNESSolve(snes,NULL,Ucoarse); >>>> >>>> if (you decide you want to see the refined solution at this continuation point) { >>>> SNESCreate(comm,&snesrefine); >>>> SNESSetDM() >>>> etc >>>> SNESSetGridSequence(snesrefine,) >>>> SNESSolve(snesrefine,0,Ucoarse); >>>> SNESGetSolution(snesrefine,&Ufine); >>>> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >>>> SNESDestroy(snesrefine); >>>> end if >>>> >>>> end loop over continuation parameter. >>>> >>>> Barry >>>> >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>>>>> >>>>>>> for( continuation parameter p = 0 to 1){ >>>>>>> >>>>>>> solve with parameter p_i using solution from p_{i-1}, >>>>>>> } >>>>>>> >>>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>>>>> >>>>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>>>> >>>>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>>>> >>>>>> Do not use -snes_grid_sequencing >>>>>> >>>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>>>> >>>>>> Call SNESSetGridSequence() >>>>>> >>>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. >>> >> > From timothee.nicolas at gmail.com Fri Aug 28 22:15:30 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Sat, 29 Aug 2015 12:15:30 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? Message-ID: Hi, I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. My problem is the following : I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. 
However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : PetscScalar, pointer :: gX2D(:,:,:) PetscScalar, pointer :: gX(:,:,:,:) ! LocalArray is locally filled ! It is transmitted to GlobalArray via MPI_Allgather real(8) :: LocalArray(user%dof,user%mr,user%mz) real(8) :: GlobalArray(user%dof,user%mr,user%mz) call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) call DMDAVecGetArrayF90(da,X,gX,ierr) do k = user%phis,user%phie do j = user%zs,user%ze do i = user%rs,user%re do l=1,user%dof if (k.eq.phi_print) then ! Numbering obtained with DMDAGetArrayF90 differs from usual LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) end if end do end do end do end do nvals = user%dof*user%rm*user%zm call MPI_AllGather(LocalArray(1,user%rs,user%zs), & & nvals,MPI_REAL, & & GlobalArray, & & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) do j = zs2D,ze2D do i = rs2D,re2D do l=1,user%dof gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) end do end do end do call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) call DMDAVecRestoreArrayF90(da,X,gX,ierr) The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. Does anybody have ideas ? Best Timoth?e -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 22:40:46 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 22:40:46 -0500 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: References: Message-ID: I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is not truly scalable but perhaps you could take a look at that. Since you say "(I have lots of points in phi, so this can happen easily)" it may be ok for you to just stick the 2d slice on process 0 and then save it? Barry Without using AOApplicationToPetsc() or something similar yes it is in general a nightmare. > On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas wrote: > > Hi, > > I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. > > My problem is the following : > > I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. > > As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. 
However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : > > PetscScalar, pointer :: gX2D(:,:,:) > PetscScalar, pointer :: gX(:,:,:,:) > ! LocalArray is locally filled > ! It is transmitted to GlobalArray via MPI_Allgather > real(8) :: LocalArray(user%dof,user%mr,user%mz) > real(8) :: GlobalArray(user%dof,user%mr,user%mz) > > call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) > call DMDAVecGetArrayF90(da,X,gX,ierr) > > do k = user%phis,user%phie > do j = user%zs,user%ze > do i = user%rs,user%re > do l=1,user%dof > if (k.eq.phi_print) then > ! Numbering obtained with DMDAGetArrayF90 differs from usual > LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) > end if > end do > end do > end do > end do > > nvals = user%dof*user%rm*user%zm > > call MPI_AllGather(LocalArray(1,user%rs,user%zs), & > & nvals,MPI_REAL, & > & GlobalArray, & > & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) > > do j = zs2D,ze2D > do i = rs2D,re2D > do l=1,user%dof > gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) > end do > end do > end do > > call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) > call DMDAVecRestoreArrayF90(da,X,gX,ierr) > > The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : > > http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather > > I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. > > Does anybody have ideas ? > > Best > > Timoth?e From jroman at dsic.upv.es Sat Aug 29 02:55:35 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 29 Aug 2015 09:55:35 +0200 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: Message-ID: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> You are doing a spectrum slicing run (EPSSetInterval) with ?size? partitions (EPSKrylovSchurSetPartitions), so every single process will be in charge of computing a subinterval. Each subcommunicator needs a redundant copy of the matrix, and in this case this copy is SeqAIJ since subcommunicators consist in just one process. You will probably need to share this memory across a set of processes and use MUMPS for the factorization. Try setting e.g. size/8 partitions. Jose > El 29/8/2015, a las 1:14, Barry Smith escribi?: > > > It is using a SeqAIJ matrix, not a parallel matrix. Increasing the number of cores won't affect the size of a sequential matrix since it must be stored entirely on one process. Perhaps you need to use parallel matrices? 
> > > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > > >> On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: >> >> Hi All, >> >> I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: >> >> --------------------- Error Message -------------------------------------------------------------- >> [1]Total space allocated 1736835920 bytes >> [1]PETSC ERROR: Out of memory. This could be due to allocating >> [1]PETSC ERROR: too large an object or bleeding by not properly >> [1]PETSC ERROR: destroying unneeded objects. >> [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 >> [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 >> [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c >> [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c >> [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c >> [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c >> [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c >> [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c >> [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c >> [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c >> [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c >> [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp >> [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 >> [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 >> >> >> The curious thing about this error, is that it seems that if 
I increase the number of nodes, from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. >> >> The relevant code piece is: >> >> void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) >> { >> >> >> EPS eps; /* eigenproblem solver context */ >> EPSType type; >> ST st; >> KSP ksp; >> PC pc; >> PetscReal tol,error; >> PetscReal lower,upper; >> //PetscInt nev=dim,maxit,its; >> PetscInt nev,maxit,its,nconv; >> Vec xr,xi; >> PetscScalar kr,ki; >> PetscReal re,im; >> PetscViewer viewer; >> PetscInt rank; >> PetscInt size; >> std::string eig_file_n; >> std::ofstream eig_file; >> char ofile[100]; >> >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank); >> MPI_Comm_size(PETSC_COMM_WORLD,&size); >> >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); >> ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); >> >> eig_file_n.append(params->ofile_n); >> eig_file_n.append("_eval"); >> eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); >> >> //Set operators. In this case, it is a standard eigenvalue problem >> ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); >> ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); >> >> ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); >> >> ierr = EPSGetST(eps,&st);CHKERRV(ierr); >> ierr = STSetType(st,STSINVERT);CHKERRV(ierr); >> >> ierr = STGetKSP(st,&ksp);CHKERRV(ierr); >> ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); >> ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); >> ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); >> ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); >> >> for(PetscInt i=0;inf;i++){ >> lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); >> upper=4.0*params->m[i]*params->m[i]; >> ierr = EPSSetInterval(eps,lower,upper); >> ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); >> //Set solver parameters at runtime >> ierr = EPSSetFromOptions(eps);CHKERRV(ierr); >> // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); >> >> ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); >> ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); >> >> >> ierr = EPSSolve(eps);CHKERRV(ierr); >> >> ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); >> >> >> //Optional: Get some information from the solver and display it >> ierr = EPSGetType(eps,&type);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); >> ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); >> ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); >> >> ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); >> >> strcpy(ofile,params->ofile_n); >> strcat(ofile,"_evecr"); >> >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); >> >> if (nconv>0) >> { >> ierr = PetscPrintf(PETSC_COMM_WORLD, >> " k ||Ax-kx||/||kx||\n" >> " ----------------- ------------------\n");CHKERRV(ierr); >> >> for (PetscInt i=0;i> { >> //Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part) >> ierr = EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); >> 
//Compute the relative error associated to each eigenpair >> ierr = EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); >> >> #if defined(PETSC_USE_COMPLEX) >> re = PetscRealPart(kr); >> im = PetscImaginaryPart(kr); >> #else >> re = kr; >> im = ki; >> #endif >> >> if (im!=0.0) >> { >> >> ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j %12g\n",re,im,error);CHKERRV(ierr); >> if(rank==0) eig_file << re << " " << im << " " << error << std::endl; >> } else >> { >> ierr = PetscPrintf(PETSC_COMM_WORLD," %12f %12g\n",re,error);CHKERRV(ierr); >> if(rank==0) eig_file << re << " " << 0 << " " << error << std::endl; >> } >> >> ierr = VecView(xr,viewer);CHKERRV(ierr); >> >> } >> ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); >> } >> } >> eig_file.close(); >> ierr = EPSDestroy(&eps);CHKERRV(ierr); >> ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); >> ierr = VecDestroy(&xr);CHKERRV(ierr); >> ierr = VecDestroy(&xi);CHKERRV(ierr); >> >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue Solver---\n");CHKERRV(ierr); >> } >> >> >> >> Thanks, >> Hank > From timothee.nicolas at gmail.com Sat Aug 29 03:46:35 2015 From: timothee.nicolas at gmail.com (timothee.nicolas at gmail.com) Date: Sat, 29 Aug 2015 17:46:35 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: References: Message-ID: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> I see. I will have a look. Regarding memory there would be no problem to put everything on process 0 I believe. If I can't figure out how to process with your routine, I will go back to the initial try with mpi_allgather. Thx Timothee Sent from my iPhone > On 2015/08/29, at 12:40, Barry Smith wrote: > > > I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is not truly scalable but perhaps you could take a look at that. Since you say "(I have lots of points in phi, so this can happen easily)" it may be ok for you to just stick the 2d slice on process 0 and then save it? > > Barry > > Without using AOApplicationToPetsc() or something similar yes it is in general a nightmare. > > >> On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas wrote: >> >> Hi, >> >> I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. >> >> My problem is the following : >> >> I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. >> >> As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : >> >> PetscScalar, pointer :: gX2D(:,:,:) >> PetscScalar, pointer :: gX(:,:,:,:) >> ! LocalArray is locally filled >> ! 
It is transmitted to GlobalArray via MPI_Allgather >> real(8) :: LocalArray(user%dof,user%mr,user%mz) >> real(8) :: GlobalArray(user%dof,user%mr,user%mz) >> >> call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) >> call DMDAVecGetArrayF90(da,X,gX,ierr) >> >> do k = user%phis,user%phie >> do j = user%zs,user%ze >> do i = user%rs,user%re >> do l=1,user%dof >> if (k.eq.phi_print) then >> ! Numbering obtained with DMDAGetArrayF90 differs from usual >> LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) >> end if >> end do >> end do >> end do >> end do >> >> nvals = user%dof*user%rm*user%zm >> >> call MPI_AllGather(LocalArray(1,user%rs,user%zs), & >> & nvals,MPI_REAL, & >> & GlobalArray, & >> & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) >> >> do j = zs2D,ze2D >> do i = rs2D,re2D >> do l=1,user%dof >> gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) >> end do >> end do >> end do >> >> call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) >> call DMDAVecRestoreArrayF90(da,X,gX,ierr) >> >> The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : >> >> http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather >> >> I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. >> >> Does anybody have ideas ? >> >> Best >> >> Timoth?e > From jychang48 at gmail.com Sat Aug 29 16:41:36 2015 From: jychang48 at gmail.com (Justin Chang) Date: Sat, 29 Aug 2015 15:41:36 -0600 Subject: [petsc-users] Using the fieldsplit/schur/selfp preconditoner Message-ID: Hi all, I am attempting to solve Darcy's equation: u + grad[p] = g div[u] = f The weak form under the least-squares finite element method (LSFEM) looks like this: (u + grad[p]; v + grad[q]) + div[u]*div[v] = (g; v + grad[q]) + (f; div[v]) The classical mixed formulations using H(div) elements has the following weak form: (u; v) - (p; div[v]) - (div[v]; q) = (g; v) - (f; q) For H(div) elements like RT0 and BDM, I was told that I could use these options: -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type bjacobi -fieldsplit_0_sub_pc_type ilu -fieldsplir_1_ksp_type preonly -fieldsplit_1_pc_type hypre This works nicely for the classical mixed form if g was zero and f was nonzero. It also works if f was zero and g was non-zero although it seems to me the solver requires a few more iterations. Now when I attempt to apply these options to the LSFEM, my u solution is nonsensical while my p is correct for nonzero g. For nonzero f, the solver doesn't converge at all. II have used CG/Jacobi with success for small LSFEM problems, but I was wondering if it's possible (or even necessary) to do a fieldsplit/schur complement for this kind of problem and how I could modify the above options. Or what other preconditioner would work best for this type of problem where its symmetric and positive definite? 
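In case it helps, this is roughly how I am wiring those same options up in code for the classical mixed form. It is only a sketch: the assembled operator A and the index sets is_u and is_p for the velocity/pressure split are assumed to come from my discretization.

  KSP ksp;
  PC  pc;
  PetscErrorCode ierr;
  /* A, is_u and is_p (mixed-form operator and the velocity/pressure
     index sets) are assumed to exist already */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc,"0",is_u);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc,"1",is_p);CHKERRQ(ierr);
  ierr = PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
  ierr = PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,NULL);CHKERRQ(ierr);
  /* the -fieldsplit_0_* and -fieldsplit_1_* sub-solver options listed
     above are still picked up from the command line here */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
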
Thanks, Justin From timothee.nicolas at gmail.com Sat Aug 29 23:12:26 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Sun, 30 Aug 2015 13:12:26 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> References: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> Message-ID: Hi, I finally found a quite simple solution using MPI_WRITE_FILE_AT, even though there may be a more efficient one. Since it is only used to write a single file with reduced dimensions, it should not be a hindrance. I finally don't use Petsc viewers neither a secondary 2D DMDA. The routine looks like this, if anyone is interested : subroutine WriteVectorSelectK0(da,X,k0,filename,flg_open,ierr) implicit none DM :: da Vec :: X PetscErrorCode :: ierr PetscViewer :: viewer PetscBool :: flg_open PetscScalar, pointer :: gX(:,:,:,:) PetscScalar :: LocalArray(user%dof,user%mr,user%mz) character(*) :: filename PetscInt :: thefile integer(kind=MPI_OFFSET_KIND) :: offset PetscInt :: i,j ! The slice to select in the Z direction PetscInt :: k0 ! access the array, indexed with global indices call DMDAVecGetArrayF90(da,X,gX,ierr) ! open a file call MPI_FILE_OPEN(MPI_COMM_WORLD, filename, & MPI_MODE_WRONLY + MPI_MODE_CREATE, & MPI_INFO_NULL, thefile, ierr) ! only the processes which contain data on the slice k=k0 are interesting and will write to the file if (k0.ge.user%phis .and. k0.le.user%phie) then do j=user%zs,user%ze offset = user%dof*((j-1)*user%mr+(user%rs-1))*8 ! Indexing of gX : one has to subtract 1 because of the C like indexing resulting from DMDAVecGetArrayF90 call MPI_FILE_WRITE_AT(thefile,offset,gX(:,user%rs-1,j-1,k0-1), & & user%dof, MPI_DOUBLE_PRECISION, & & MPI_STATUS_IGNORE, ierr) end do end if call MPI_FILE_CLOSE(thefile, ierr) ! restore the array call DMDAVecRestoreArrayF90(da,X,gX,ierr) end subroutine WriteVectorSelectK0 Unfortunately, I did not find a way to avoid the loop on j, which are in principle quite inefficient. That is because when you reach the end of the local block in the first direction (user%re in my example), the next point where j is incremented by 1 is not contiguous in memory in the written file. So you have to change the offset and use a new call to MPI_FILE_WRITE_AT. Two things that one should be careful with : 1. By default, MPI-IO does not seem to include the 4 or 8 bytes at the beginning of the file which are often added by FORTRAN (for example Petsc Viewer add 8 bytes, which include one integer about the size of the data written in the file). When you read your file outside of FORTRAN (e.g. with python), you should be careful about this difference. 2. On my machine, Petsc Viewer writes the data in big endian. However, the routine above gives me little endian (however this may be machine dependent). Best Timoth?e 2015-08-29 17:46 GMT+09:00 : > I see. I will have a look. Regarding memory there would be no problem to > put everything on process 0 I believe. If I can't figure out how to process > with your routine, I will go back to the initial try with mpi_allgather. > > Thx > > Timothee > > Sent from my iPhone > > > On 2015/08/29, at 12:40, Barry Smith wrote: > > > > > > I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of > a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is > not truly scalable but perhaps you could take a look at that. 
Since you > say "(I have lots of points in phi, so this can happen easily)" it may be > ok for you to just stick the 2d slice on process 0 and then save it? > > > > Barry > > > > Without using AOApplicationToPetsc() or something similar yes it is in > general a nightmare. > > > > > >> On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > >> > >> Hi, > >> > >> I have been thinking for several hours about this problem and can't > find an efficient solution, however I imagine this must be possible somehow > with Petsc. > >> > >> My problem is the following : > >> > >> I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want > to save all the data of all my fields, even just once in a while. Instead, > I would like to save in a binary file a slice at a given angle, say phi=0. > >> > >> As I did not find if it's natively possible in Petsc, I considered > creating a second 2D DMDA, on which I can create 2D vectors and view them > with the binary viewer. So far so good. However, upon creating the 2D DMDA, > naturally the distribution of processors does not correspond to the > distribution of the 3D DMDA. So I was considering creating global arrays, > filling them with the data of the 3D array in phi=0, then doing an > MPI_allgather to give the information to all the processors, to be able to > read the array and fill the 2D Petsc Vector with it. So the code would be > something along the lines of : > >> > >> PetscScalar, pointer :: gX2D(:,:,:) > >> PetscScalar, pointer :: gX(:,:,:,:) > >> ! LocalArray is locally filled > >> ! It is transmitted to GlobalArray via MPI_Allgather > >> real(8) :: LocalArray(user%dof,user%mr,user%mz) > >> real(8) :: GlobalArray(user%dof,user%mr,user%mz) > >> > >> call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) > >> call DMDAVecGetArrayF90(da,X,gX,ierr) > >> > >> do k = user%phis,user%phie > >> do j = user%zs,user%ze > >> do i = user%rs,user%re > >> do l=1,user%dof > >> if (k.eq.phi_print) then > >> ! Numbering obtained with DMDAGetArrayF90 differs from > usual > >> LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) > >> end if > >> end do > >> end do > >> end do > >> end do > >> > >> nvals = user%dof*user%rm*user%zm > >> > >> call MPI_AllGather(LocalArray(1,user%rs,user%zs), & > >> & nvals,MPI_REAL, & > >> & GlobalArray, & > >> & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) > >> > >> do j = zs2D,ze2D > >> do i = rs2D,re2D > >> do l=1,user%dof > >> gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) > >> end do > >> end do > >> end do > >> > >> call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) > >> call DMDAVecRestoreArrayF90(da,X,gX,ierr) > >> > >> The problem is that MPI_allgather is not at all that simple. Exchanging > array information is much more complicated that I had anticipated ! See > this long post on stackoverflow : > >> > >> > http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather > >> > >> I could probably get it to work eventually, but it's pretty > complicated, and I was wondering if there was not a simpler alternative I > could not see. Besides, I am concerned about what could happen if the > number of processors is so large that the 2D Vector gets less than 2 points > per processor (I have lots of points in phi, so this can happen easily). > Then Petsc would complain. > >> > >> Does anybody have ideas ? > >> > >> Best > >> > >> Timoth?e > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hanklammiv at gmail.com Mon Aug 31 14:50:48 2015 From: hanklammiv at gmail.com (Hank Lamm) Date: Mon, 31 Aug 2015 12:50:48 -0700 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> References: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> Message-ID: If I use MatView to check the type of my matrix, it replies mpiaij, not seqaij. Am I correct in understanding your comments to mean that the reason for the error that when I do the spectrum slicing, it is creating seqaij for each processor? On Sat, Aug 29, 2015 at 12:55 AM, Jose E. Roman wrote: > You are doing a spectrum slicing run (EPSSetInterval) with ?size? > partitions (EPSKrylovSchurSetPartitions), so every single process will be > in charge of computing a subinterval. Each subcommunicator needs a > redundant copy of the matrix, and in this case this copy is SeqAIJ since > subcommunicators consist in just one process. You will probably need to > share this memory across a set of processes and use MUMPS for the > factorization. Try setting e.g. size/8 partitions. > > Jose > > > > El 29/8/2015, a las 1:14, Barry Smith escribi?: > > > > > > It is using a SeqAIJ matrix, not a parallel matrix. Increasing the > number of cores won't affect the size of a sequential matrix since it must > be stored entirely on one process. Perhaps you need to use parallel > matrices? > > > > > > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > > > > > >> On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: > >> > >> Hi All, > >> > >> I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My > code should be a simple eigenvalue solver, but when I attempt to solve > large problems (8488x8488 matrices) I get errors: > >> > >> --------------------- Error Message > -------------------------------------------------------------- > >> [1]Total space allocated 1736835920 bytes > >> [1]PETSC ERROR: Out of memory. This could be due to allocating > >> [1]PETSC ERROR: too large an object or bleeding by not properly > >> [1]PETSC ERROR: destroying unneeded objects. > >> [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process > 1769742336 > >> [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 > >> [1]PETSC ERROR: [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [1]PETSC ERROR: #8 STSetUp() line 305 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line > 4030 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > >> [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c > >> [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > >> [1]PETSC ERROR: #5 MatDuplicate() line 4252 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > >> [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c > >> [1]PETSC ERROR: #8 STSetUp() line 305 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > >> [1]PETSC ERROR: #12 EPSSetUp() line 121 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > >> [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > >> [1]PETSC ERROR: #16 EPSSetUp() line 121 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > >> [1]PETSC ERROR: #17 EPSSolve() line 88 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c > >> [1]PETSC ERROR: #18 eigensolver() line 64 in > /work/03324/hlammiv/TMSWIFT/src/solver.cpp > >> [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() > 1.73684e+09 > >> [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 > >> > >> > >> The curious thing about this error, is that it seems that if I increase > the number of nodes, from 32 to 64 to 128, the amount of memory per node > doesn't decrease. I have used valgrind and it doesn't seem to a memory > leak. 
> >> > >> The relevant code piece is: > >> > >> void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, > char **argv) > >> { > >> > >> > >> EPS eps; /* eigenproblem solver context */ > >> EPSType type; > >> ST st; > >> KSP ksp; > >> PC pc; > >> PetscReal tol,error; > >> PetscReal lower,upper; > >> //PetscInt nev=dim,maxit,its; > >> PetscInt nev,maxit,its,nconv; > >> Vec xr,xi; > >> PetscScalar kr,ki; > >> PetscReal re,im; > >> PetscViewer viewer; > >> PetscInt rank; > >> PetscInt size; > >> std::string eig_file_n; > >> std::ofstream eig_file; > >> char ofile[100]; > >> > >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > >> MPI_Comm_size(PETSC_COMM_WORLD,&size); > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue > Solver---\n");CHKERRV(ierr); > >> ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); > >> > >> eig_file_n.append(params->ofile_n); > >> eig_file_n.append("_eval"); > >> eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); > >> > >> //Set operators. In this case, it is a standard eigenvalue problem > >> ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); > >> ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); > >> > >> ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); > >> > >> ierr = EPSGetST(eps,&st);CHKERRV(ierr); > >> ierr = STSetType(st,STSINVERT);CHKERRV(ierr); > >> > >> ierr = STGetKSP(st,&ksp);CHKERRV(ierr); > >> ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); > >> ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); > >> ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); > >> ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); > >> > >> for(PetscInt i=0;inf;i++){ > >> > lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); > >> upper=4.0*params->m[i]*params->m[i]; > >> ierr = EPSSetInterval(eps,lower,upper); > >> ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); > >> //Set solver parameters at runtime > >> ierr = EPSSetFromOptions(eps);CHKERRV(ierr); > >> // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); > >> > >> ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); > >> ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); > >> > >> > >> ierr = EPSSolve(eps);CHKERRV(ierr); > >> > >> ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the > method: %D\n",its);CHKERRV(ierr); > >> > >> > >> //Optional: Get some information from the solver and display it > >> ierr = EPSGetType(eps,&type);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: > %s\n\n",type);CHKERRV(ierr); > >> ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested > eigenvalues: %D\n",nev);CHKERRV(ierr); > >> ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: > tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); > >> > >> ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged > eigenpairs: %D\n\n",nconv);CHKERRV(ierr); > >> > >> strcpy(ofile,params->ofile_n); > >> strcat(ofile,"_evecr"); > >> > >> ierr = > PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); > >> > >> if (nconv>0) > >> { > >> ierr = PetscPrintf(PETSC_COMM_WORLD, > >> " k ||Ax-kx||/||kx||\n" > >> " ----------------- > ------------------\n");CHKERRV(ierr); > >> > >> for (PetscInt i=0;i >> { > >> //Get converged eigenpairs: i-th eigenvalue is stored in kr > (real part) and ki (imaginary part) > >> ierr = > 
EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); > >> //Compute the relative error associated to each > eigenpair > >> ierr = > EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); > >> > >> #if defined(PETSC_USE_COMPLEX) > >> re = PetscRealPart(kr); > >> im = PetscImaginaryPart(kr); > >> #else > >> re = kr; > >> im = ki; > >> #endif > >> > >> if (im!=0.0) > >> { > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j > %12g\n",re,im,error);CHKERRV(ierr); > >> if(rank==0) eig_file << re << " " << im << " " << error > << std::endl; > >> } else > >> { > >> ierr = PetscPrintf(PETSC_COMM_WORLD," %12f > %12g\n",re,error);CHKERRV(ierr); > >> if(rank==0) eig_file << re << " " << 0 << " " << error > << std::endl; > >> } > >> > >> ierr = VecView(xr,viewer);CHKERRV(ierr); > >> > >> } > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); > >> } > >> } > >> eig_file.close(); > >> ierr = EPSDestroy(&eps);CHKERRV(ierr); > >> ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); > >> ierr = VecDestroy(&xr);CHKERRV(ierr); > >> ierr = VecDestroy(&xi);CHKERRV(ierr); > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue > Solver---\n");CHKERRV(ierr); > >> } > >> > >> > >> > >> Thanks, > >> Hank > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Aug 31 15:16:36 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 31 Aug 2015 22:16:36 +0200 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> Message-ID: <687CC1A8-C197-4767-A164-1D353CE11F93@dsic.upv.es> > El 31/8/2015, a las 21:50, Hank Lamm escribi?: > > If I use MatView to check the type of my matrix, it replies mpiaij, not seqaij. Am I correct in understanding your comments to mean that the reason for the error that when I do the spectrum slicing, it is creating seqaij for each processor? It is creating a seqaij matrix if partitions=size. The more partitions the more memory per node you need. I presume with partitions=size there is not enough memory to store the copy of the matrix plus the factorized matrix. Use of multiple communicators is explained in section 3.4.5 of SLEPc users guide, including the use of MUMPS. Jose From jychang48 at gmail.com Mon Aug 31 19:36:02 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 31 Aug 2015 19:36:02 -0500 Subject: [petsc-users] Integrating TAO into SNES and TS In-Reply-To: <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> References: <16064BCD-8B50-4E95-AF7F-386F6780E645@mcs.anl.gov> <0759EFDD-F7BB-4826-B6E0-241159EF0D21@mcs.anl.gov> <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> Message-ID: Coming back to this, Say I now want to ensure the DMP for advection-diffusion equations. The linear operator is now asymmetric and non-self-adjoint (assuming I do something like SUPG or finite volume), meaning I cannot simply solve this problem without any manipulation (e.g. normalizing the equations) using TAO's optimization solvers. Does this statement also hold true for SNESVI? Thanks, Justin On Fri, Apr 3, 2015 at 7:38 PM, Barry Smith wrote: > > > On Apr 3, 2015, at 7:35 PM, Justin Chang wrote: > > > > I guess I will have to write my own code then :) > > > > I am not all that familiar with Variational Inequalities at the moment, > but if my Jacobian is symmetric and positive definite and I only have lower > and upper bounds, doesn't the problem simply reduce to that of a convex > optimization? 
That is, with SNES act as if it were Tao? > > Yes, I think that is essentially correctly. > > Barry > > > > > On Fri, Apr 3, 2015 at 6:35 PM, Barry Smith wrote: > > > > Justin, > > > > We haven't done anything with TS to handle variational inequalities. > So you can either write your own backward Euler (outside of TS) that solves > each time-step problem either as 1) an optimization problem using Tao or 2) > as a variational inequality using SNES. > > > > More adventurously you could look at the TSTHETA code in TS (which is > a general form that includes Euler, Backward Euler and Crank-Nicolson and > see if you can add the constraints to the SNES problem that is solved; in > theory this is straightforward but it would require understanding the > current code (which Jed, of course, overwrote :-). I think you should do > this. > > > > Barry > > > > > > > On Apr 3, 2015, at 12:31 PM, Justin Chang wrote: > > > > > > I am solving the following anisotropic transient diffusion equation > subject to 0 bounds: > > > > > > du/dt = div[D*grad[u]] + f > > > > > > Where the dispersion tensor D(x) is symmetric and positive definite. > This formulation violates the discrete maximum principles so one of the > ways to ensure nonnegative concentrations is to employ convex optimization. > I am following the procedures in Nakshatrala and Valocchi (2009) JCP and > Nagarajan and Nakshatrala (2011) IJNMF. > > > > > > The Variational Inequality method works gives what I want for my > transient case, but what if I want to implement the Tao methodology in TS? > That is, what TS functions do I need to set up steps a) through e) for each > time step (also the Jacobian remains the same for all time steps so I would > only call this once). Normally I would just call TSSolve() and let the > libraries and functions do everything, but I would like to incorporate > TaoSolve into every time step. > > > > > > Thanks, > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental > Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > On Thu, Apr 2, 2015 at 6:53 PM, Barry Smith > wrote: > > > > > > An alternative approach is for you to solve it as a (non)linear > variational inequality. See src/snes/examples/tutorials/ex9.c > > > > > > How you should proceed depends on your long term goal. What problem > do you really want to solve? Is it really a linear time dependent problem > with 0 bounds on U? Can the problem always be represented as an > optimization problem easily? What are and what will be the properties of > K? For example if K is positive definite then likely the bounds will remain > try without explicitly providing the constraints. > > > > > > Barry > > > > > > > On Apr 2, 2015, at 6:39 PM, Justin Chang wrote: > > > > > > > > Hi everyone, > > > > > > > > I have a two part question regarding the integration of the > following optimization problem > > > > > > > > min 1/2 u^T*K*u + u^T*f > > > > S.T. 
u >= 0 > > > > > > > > into SNES and TS > > > > > > > > 1) For SNES, assuming I am working with a linear FE equation, I have > the following algorithm/steps for solving my problem > > > > > > > > a) Set an initial guess x > > > > b) Obtain residual r and jacobian A through functions > SNESComputeFunction() and SNESComputeJacobian() respectively > > > > c) Form vector b = r - A*x > > > > d) Set Hessian equal to A, gradient to A*x, objective function value > to 1/2*x^T*A*x + x^T*b, and variable (lower) bounds to a zero vector > > > > e) Call TaoSolve > > > > > > > > This works well at the moment, but my question is there a more > "efficient" way of doing this? Because with my current setup, I am making a > rather bold assumption that my problem would converge in one SNES iteration > without the bounded constraints and does not have any unexpected > nonlinearities. > > > > > > > > 2) How would I go about doing the above for time-stepping problems? > At each time step, I want to solve a convex optimization subject to the > lower bounds constraint. I plan on using backward euler and my resulting > jacobian should still be compatible with the above optimization problem. > > > > > > > > Thanks, > > > > > > > > -- > > > > Justin Chang > > > > PhD Candidate, Civil Engineering - Computational Sciences > > > > University of Houston, Department of Civil and Environmental > Engineering > > > > Houston, TX 77004 > > > > (512) 963-3262 > > > > > > > > > > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental > Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > > > > > -- > > Justin Chang > > PhD Candidate, Civil Engineering - Computational Sciences > > University of Houston, Department of Civil and Environmental Engineering > > Houston, TX 77004 > > (512) 963-3262 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 31 20:13:54 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 31 Aug 2015 20:13:54 -0500 Subject: [petsc-users] Integrating TAO into SNES and TS In-Reply-To: References: <16064BCD-8B50-4E95-AF7F-386F6780E645@mcs.anl.gov> <0759EFDD-F7BB-4826-B6E0-241159EF0D21@mcs.anl.gov> <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> Message-ID: > On Aug 31, 2015, at 7:36 PM, Justin Chang wrote: > > Coming back to this, > > Say I now want to ensure the DMP for advection-diffusion equations. The linear operator is now asymmetric and non-self-adjoint (assuming I do something like SUPG or finite volume), meaning I cannot simply solve this problem without any manipulation (e.g. normalizing the equations) using TAO's optimization solvers. Does this statement also hold true for SNESVI? SNESVI doesn't care about symmetry etc > > Thanks, > Justin > > On Fri, Apr 3, 2015 at 7:38 PM, Barry Smith wrote: > > > On Apr 3, 2015, at 7:35 PM, Justin Chang wrote: > > > > I guess I will have to write my own code then :) > > > > I am not all that familiar with Variational Inequalities at the moment, but if my Jacobian is symmetric and positive definite and I only have lower and upper bounds, doesn't the problem simply reduce to that of a convex optimization? That is, with SNES act as if it were Tao? > > Yes, I think that is essentially correctly. > > Barry > > > > > On Fri, Apr 3, 2015 at 6:35 PM, Barry Smith wrote: > > > > Justin, > > > > We haven't done anything with TS to handle variational inequalities. 
So you can either write your own backward Euler (outside of TS) that solves each time-step problem either as 1) an optimization problem using Tao or 2) as a variational inequality using SNES. > > > > More adventurously you could look at the TSTHETA code in TS (which is a general form that includes Euler, Backward Euler and Crank-Nicolson and see if you can add the constraints to the SNES problem that is solved; in theory this is straightforward but it would require understanding the current code (which Jed, of course, overwrote :-). I think you should do this. > > > > Barry > > > > > > > On Apr 3, 2015, at 12:31 PM, Justin Chang wrote: > > > > > > I am solving the following anisotropic transient diffusion equation subject to 0 bounds: > > > > > > du/dt = div[D*grad[u]] + f > > > > > > Where the dispersion tensor D(x) is symmetric and positive definite. This formulation violates the discrete maximum principles so one of the ways to ensure nonnegative concentrations is to employ convex optimization. I am following the procedures in Nakshatrala and Valocchi (2009) JCP and Nagarajan and Nakshatrala (2011) IJNMF. > > > > > > The Variational Inequality method works gives what I want for my transient case, but what if I want to implement the Tao methodology in TS? That is, what TS functions do I need to set up steps a) through e) for each time step (also the Jacobian remains the same for all time steps so I would only call this once). Normally I would just call TSSolve() and let the libraries and functions do everything, but I would like to incorporate TaoSolve into every time step. > > > > > > Thanks, > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > On Thu, Apr 2, 2015 at 6:53 PM, Barry Smith wrote: > > > > > > An alternative approach is for you to solve it as a (non)linear variational inequality. See src/snes/examples/tutorials/ex9.c > > > > > > How you should proceed depends on your long term goal. What problem do you really want to solve? Is it really a linear time dependent problem with 0 bounds on U? Can the problem always be represented as an optimization problem easily? What are and what will be the properties of K? For example if K is positive definite then likely the bounds will remain try without explicitly providing the constraints. > > > > > > Barry > > > > > > > On Apr 2, 2015, at 6:39 PM, Justin Chang wrote: > > > > > > > > Hi everyone, > > > > > > > > I have a two part question regarding the integration of the following optimization problem > > > > > > > > min 1/2 u^T*K*u + u^T*f > > > > S.T. u >= 0 > > > > > > > > into SNES and TS > > > > > > > > 1) For SNES, assuming I am working with a linear FE equation, I have the following algorithm/steps for solving my problem > > > > > > > > a) Set an initial guess x > > > > b) Obtain residual r and jacobian A through functions SNESComputeFunction() and SNESComputeJacobian() respectively > > > > c) Form vector b = r - A*x > > > > d) Set Hessian equal to A, gradient to A*x, objective function value to 1/2*x^T*A*x + x^T*b, and variable (lower) bounds to a zero vector > > > > e) Call TaoSolve > > > > > > > > This works well at the moment, but my question is there a more "efficient" way of doing this? 
Because with my current setup, I am making a rather bold assumption that my problem would converge in one SNES iteration without the bounded constraints and does not have any unexpected nonlinearities. > > > > > > > > 2) How would I go about doing the above for time-stepping problems? At each time step, I want to solve a convex optimization subject to the lower bounds constraint. I plan on using backward euler and my resulting jacobian should still be compatible with the above optimization problem. > > > > > > > > Thanks, > > > > > > > > -- > > > > Justin Chang > > > > PhD Candidate, Civil Engineering - Computational Sciences > > > > University of Houston, Department of Civil and Environmental Engineering > > > > Houston, TX 77004 > > > > (512) 963-3262 > > > > > > > > > > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > > > > > -- > > Justin Chang > > PhD Candidate, Civil Engineering - Computational Sciences > > University of Houston, Department of Civil and Environmental Engineering > > Houston, TX 77004 > > (512) 963-3262 > >
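A minimal sketch of the bound-constrained SNES setup referred to above (illustrative only: the SNES with its residual/Jacobian callbacks, the solution vector x, and the bound vectors xl and xu are all assumed to exist with matching layout):

  SNES snes;    /* assumed: created, with FormFunction/FormJacobian attached */
  Vec  x,xl,xu; /* assumed: solution and bound vectors, same layout */
  PetscErrorCode ierr;
  ierr = SNESSetType(snes,SNESVINEWTONRSLS);CHKERRQ(ierr);
  ierr = VecSet(xl,0.0);CHKERRQ(ierr);              /* enforce u >= 0 */
  ierr = VecSet(xu,PETSC_INFINITY);CHKERRQ(ierr);   /* no upper bound */
  ierr = SNESVISetVariableBounds(snes,xl,xu);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
  ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);

The VI solver places no symmetry requirement on the Jacobian, so the same setup applies with an SUPG or finite-volume advection-diffusion operator.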