From asmund.ervik at ntnu.no Mon Sep 1 03:22:50 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Mon, 01 Sep 2014 10:22:50 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes Message-ID: <54042CDA.9080002@ntnu.no> Hi, When I try to configure PETSc with HDF5 by using "--download-hdf5", I get a very long error message from PETSc compiling HDF5 (see attached error-down-hdf5.tar.gz) which boils down to h5tools_str.c: In function 'h5tools_str_indent': h5tools_str.c:635:1: error: expected expression before '/' token This happens both with PETSc 3.5.1 and 3.4.5. It appears this was an error in HDF5 version 1.8.10, cf. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 Subsequently I installed HDF5 v. 1.8.12 using my OS package manager and then tried to configure with "--with-hdf5 --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time both configure and make were successful, but "make test" fails with loads of undefined references to PETSc HDF5 stuff (see below). configure.log and make.log attached also for this case (error-with-hdf5.tar.gz) If you need any more info, just ask :) Regards, ?smund Using PETSC_DIR=/opt/petsc/optim_gfortran and PETSC_ARCH=linux *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /opt/petsc/optim_gfortran/src/snes/examples/tutorials ex19 ********************************************************************************* /opt/petsc/optim_gfortran/linux/bin/mpicc -o ex19.o -c -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 -march=native -I/opt/petsc/optim_gfortran/include -I/opt/petsc/optim_gfortran/linux/include -I/usr/include `pwd`/ex19.c /opt/petsc/optim_gfortran/linux/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 -march=native -o ex19 ex19.o -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -L/opt/petsc/optim_gfortran/linux/lib -lpetsc -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -lmpichcxx -lstdc++ -lsuperlu_4.3 -lflapack -lfblas -lparmetis -lmetis -lX11 -lpthread -lssl -lcrypto -Wl,-rpath,/usr/lib -L/usr/lib -lhdf5_hl -lhdf5 -lm -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetFileId' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5PushGroup' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5PopGroup' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerCreate_HDF5' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5ReadAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5WriteAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5SetTimestep' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PETSC_VIEWER_HDF5_' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5HasAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetTimestep' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetGroup' 
collect2: error: ld returned 1 exit status makefile:108: recipe for target 'ex19' failed make[3]: [ex19] Error 1 (ignored) /usr/bin/rm -f ex19.o *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /opt/petsc/optim_gfortran/src/snes/examples/tutorials ex5f ********************************************************* /opt/petsc/optim_gfortran/linux/bin/mpif90 -c -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 -march=native -fno-protect-parens -fstack-arrays -I/opt/petsc/optim_gfortran/include -I/opt/petsc/optim_gfortran/linux/include -I/usr/include -o ex5f.o ex5f.F /opt/petsc/optim_gfortran/linux/bin/mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 -march=native -fno-protect-parens -fstack-arrays -o ex5f ex5f.o -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -L/opt/petsc/optim_gfortran/linux/lib -lpetsc -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -lmpichcxx -lstdc++ -lsuperlu_4.3 -lflapack -lfblas -lparmetis -lmetis -lX11 -lpthread -lssl -lcrypto -Wl,-rpath,/usr/lib -L/usr/lib -lhdf5_hl -lhdf5 -lm -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetFileId' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5PushGroup' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5PopGroup' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerCreate_HDF5' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5ReadAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5WriteAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5SetTimestep' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PETSC_VIEWER_HDF5_' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5HasAttribute' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetTimestep' /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to `PetscViewerHDF5GetGroup' collect2: error: ld returned 1 exit status makefile:50: recipe for target 'ex5f' failed -------------- next part -------------- A non-text attachment was scrubbed... Name: error-with-hdf5.tar.gz Type: application/gzip Size: 329211 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: error-down-hdf5.tar.gz Type: application/gzip Size: 446573 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From christophe.ortiz at ciemat.es Mon Sep 1 03:56:46 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Mon, 1 Sep 2014 10:56:46 +0200 Subject: [petsc-users] Unable to configure PETSc with CUDA: Problem with thrust directory In-Reply-To: <878um745tm.fsf@jedbrown.org> References: <54008FCD.4030604@txcorp.com> <54009404.103@txcorp.com> <878um745tm.fsf@jedbrown.org> Message-ID: Hi all, Thanks a lot for your answers. I followed Satish's advice and downloaded petsc-dev. Then I tried to configure it with CUDA 6.0. It complained because I did not have flex installed, no big deal. Then it complained because it prefers GNU compilers over Intel ones, no big deal. With this fixed, I was able to configure and compile petsc-dev with CUDA 6.0. There was no problem due to a deprecated arch or the thrust directory. I configured it with the following options: --with-x=1 --with-mpi=0 --with-cc=gcc --with-cxx=g++ --with-clanguage=cxx --with-fc=gfortran --with-cuda=1 --with-cuda-dir=/usr/local/cuda-6.0 --with-cuda-arch=sm_35 --with-thrust=1 --with-thrust-dir=/usr/local/cuda-6.0/include/thrust --with-cusp=1 --with-cusp-dir=/usr/local/cuda-6.0/include/cusp --with-debugging=1 --with-scalar-type=real --with-precision=double --download-fblaslapack Now that this is done, how should I port my code to use it with CUDA? Should I change something and include some CUDA directives in the code? Are there some examples of a makefile with nvcc that I could use? Thanks in advance. Christophe On Fri, Aug 29, 2014 at 6:29 PM, Jed Brown wrote: > Satish Balay writes: > > > On Fri, 29 Aug 2014, Dominic Meiser wrote: > > > >> On 08/29/2014 08:31 AM, Matthew Knepley wrote: > >> > On Fri, Aug 29, 2014 at 9:35 AM, Dominic Meiser >> > > wrote: > > > >> > > Dominic, I think that thrust.py should depend on cuda.py. Do you > >> > > know why it does not? > >> > In principle you are right, thrust.py should depend on cuda.py. > >> > > >> > However, in my opinion, thrust.py should go away as a separate > >> > package altogether. Thrust is shipped as part of any recent > >> > version of the cuda toolkit (I forget since which version, Paul > >> > might know) and it's always installed in > >> > $CUDA_TOOLKIT_ROOT/include/thrust. Thus we can automatically > >> > deduct the thrust location from the cuda location. Thrust should > >> > be considered part of cuda. > > > > I also think it should be removed. > > Agreed. If it was correct to consolidate umfpack, cholmod, etc., into > suitesparse, then there is no question that thrust should be > consolidated into cuda. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Mon Sep 1 04:10:24 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Mon, 01 Sep 2014 11:10:24 +0200 Subject: [petsc-users] Set matrix column to vector Message-ID: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Hello, I want to set the entire column of an N x M matrix to an N vector. What is the best way to do that? My first guess would be to use VecGetArray and pass that array to MatSetValuesLocal with nrow = VecGetLocalSize. What is the best way to tell MatSetValuesLocal that I want to set all rows consecutively (the same as passing irow = [0, 1, ..., VecGetLocalSize-1])? Any better way?
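To make this concrete, here is roughly the approach I have in mind, as an untested sketch against the PETSc 3.5 C API (the names A, v and col are placeholders for my actual matrix, vector and column index; the sketch uses MatSetValues with global row indices, since presumably MatSetValuesLocal would also need a local-to-global mapping set on the matrix first):

#include <petscmat.h>

/* Sketch: copy the locally owned part of Vec v into column col of Mat A. */
PetscErrorCode SetColumnFromVec(Mat A, Vec v, PetscInt col)
{
  const PetscScalar *a;
  PetscInt          rstart, rend, n, i, *rows;
  PetscErrorCode    ierr;

  PetscFunctionBegin;
  ierr = VecGetOwnershipRange(v, &rstart, &rend);CHKERRQ(ierr);
  n    = rend - rstart;                      /* number of locally owned rows */
  ierr = PetscMalloc1(n, &rows);CHKERRQ(ierr);
  for (i = 0; i < n; i++) rows[i] = rstart + i;   /* global indices of my rows */
  ierr = VecGetArrayRead(v, &a);CHKERRQ(ierr);
  ierr = MatSetValues(A, n, rows, 1, &col, a, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(v, &a);CHKERRQ(ierr);
  ierr = PetscFree(rows);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

(This assumes the parallel row layout of v matches the row layout of A.)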
Thanks, Florian From knepley at gmail.com Mon Sep 1 05:45:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 05:45:38 -0500 Subject: [petsc-users] Set matrix column to vector In-Reply-To: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Message-ID: On Mon, Sep 1, 2014 at 4:10 AM, Florian Lindner wrote: > Hello, > > I want to set the entire column of a N x M matrix to a N vector. What is > the best way to do that? > > My first guess would be to VecGetArray and use that array for > MatSetValuesLocal with nrow = VecGetLocalSize. What is the best to say > MatSetValuesLocal that I want to set all rows continuesly (same like > passing irow = [0, 1, ..., VecGetLocalSize-1]? > > Any better way? > You are assuming dense storage above, so you can use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatDenseGetArray.html Matt > Thanks, > Florian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 1 05:51:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 05:51:01 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54042CDA.9080002@ntnu.no> References: <54042CDA.9080002@ntnu.no> Message-ID: On Mon, Sep 1, 2014 at 3:22 AM, ?smund Ervik wrote: > Hi, > > When I try to configure PETSc with HDF5 by using "--download-hdf5", I > get a very long error message from PETSc compiling HDF5 (see attached > error-down-hdf5.tar.gz) which boils down to > > h5tools_str.c: In function 'h5tools_str_indent': > h5tools_str.c:635:1: error: expected expression before '/' token > > This happens both with PETSc 3.5.1 and 3.4.5. It appears this was an > error in HDF5 version 1.8.10, cf. > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 > > > Subsequently I installed HDF5 v. 1.8.12 using my OS package manager and > then tried to configure with "--with-hdf5 > --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time both > configure and make were successful, but "make test" fails with loads of > undefined references to PETSc HDF5 stuff (see below). configure.log and > make.log attached also for this case (error-with-hdf5.tar.gz) > It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the log. However, it should have. We have been trying to understand why Make behaves in a bad way after configuration failure. It should go away with make clean and another make. Jed, here is another case of this make bug. 
Thanks, Matt > > If you need any more info, just ask :) > > Regards, > ?smund > > > Using PETSC_DIR=/opt/petsc/optim_gfortran and PETSC_ARCH=linux > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /opt/petsc/optim_gfortran/src/snes/examples/tutorials ex19 > > ********************************************************************************* > /opt/petsc/optim_gfortran/linux/bin/mpicc -o ex19.o -c -fPIC -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 > -march=native -I/opt/petsc/optim_gfortran/include > -I/opt/petsc/optim_gfortran/linux/include -I/usr/include `pwd`/ex19.c > /opt/petsc/optim_gfortran/linux/bin/mpicc -fPIC -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -O3 -march=native -o ex19 > ex19.o -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib > -L/opt/petsc/optim_gfortran/linux/lib -lpetsc > -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -lHYPRE > -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 > -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -lmpichcxx -lstdc++ > -lsuperlu_4.3 -lflapack -lfblas -lparmetis -lmetis -lX11 -lpthread -lssl > -lcrypto -Wl,-rpath,/usr/lib -L/usr/lib -lhdf5_hl -lhdf5 -lm -lmpichf90 > -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl > -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetFileId' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5PushGroup' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5PopGroup' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerCreate_HDF5' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5ReadAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5WriteAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5SetTimestep' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PETSC_VIEWER_HDF5_' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5HasAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetTimestep' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetGroup' > collect2: error: ld returned 1 exit status > makefile:108: recipe for target 'ex19' failed > make[3]: [ex19] Error 1 (ignored) > /usr/bin/rm -f ex19.o > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /opt/petsc/optim_gfortran/src/snes/examples/tutorials ex5f > ********************************************************* > /opt/petsc/optim_gfortran/linux/bin/mpif90 -c -fPIC -Wall > -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 > -march=native -fno-protect-parens -fstack-arrays > -I/opt/petsc/optim_gfortran/include > -I/opt/petsc/optim_gfortran/linux/include -I/usr/include -o ex5f.o > ex5f.F > /opt/petsc/optim_gfortran/linux/bin/mpif90 -fPIC -Wall > -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 > -march=native -fno-protect-parens -fstack-arrays -o ex5f ex5f.o > -Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib > -L/opt/petsc/optim_gfortran/linux/lib -lpetsc > 
-Wl,-rpath,/opt/petsc/optim_gfortran/linux/lib -lHYPRE > -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 > -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.0 -lmpichcxx -lstdc++ > -lsuperlu_4.3 -lflapack -lfblas -lparmetis -lmetis -lX11 -lpthread -lssl > -lcrypto -Wl,-rpath,/usr/lib -L/usr/lib -lhdf5_hl -lhdf5 -lm -lmpichf90 > -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl > -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetFileId' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5PushGroup' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5PopGroup' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerCreate_HDF5' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5ReadAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5WriteAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5SetTimestep' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PETSC_VIEWER_HDF5_' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5HasAttribute' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetTimestep' > /opt/petsc/optim_gfortran/linux/lib/libpetsc.so: undefined reference to > `PetscViewerHDF5GetGroup' > collect2: error: ld returned 1 exit status > makefile:50: recipe for target 'ex5f' failed > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 1 05:56:33 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 05:56:33 -0500 Subject: [petsc-users] Unable to configure PETSc with CUDA: Problem with thrust directory In-Reply-To: References: <54008FCD.4030604@txcorp.com> <54009404.103@txcorp.com> <878um745tm.fsf@jedbrown.org> Message-ID: On Mon, Sep 1, 2014 at 3:56 AM, Christophe Ortiz wrote: > Hi all, > > Thanks a lot for your answers. > > I followed Satish's advice and downloaded pets-dev. Then I tried to > configure it with CUDA 6.0. It complained because I did not have flex > installed, no big deal. Then it complained because it prefers GNU compilers > than Intel ones, no big deal. > > This being fixed, I was able to configure and compile pets-dev with CUDA > 6.0. There was no problem due to deprecated arch or thrust directory. > > I configured it with the following options: > > --with-x=1 --with-mpi=0 --with-cc=gcc --with-cxx=g++ --with-clanguage=cxx > --with-fc=gfortran --with-cuda=1 --with-cuda-dir=/usr/local/cuda-6.0 > --with-cuda-arch=sm_35 --with-thrust=1 > --with-thrust-dir=/usr/local/cuda-6.0/include/thrust --with-cusp=1 > --with-cusp-dir=/usr/local/cuda-6.0/include/cusp --with-debugging=1 > --with-scalar-type=real --with-precision=double --download-fblaslapack > > > Now this is done, how should I port my code to use it with CUDA ? Should I > change something and include some CUDA directives in the code ? Is there > some examples of makefile with nvcc that I could use ? > You just change the types of objects to those that use the GPU. 
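For example, here is a minimal, untested sketch, assuming a CUSP-enabled build like the one you configured above; the type names VECCUSP and MATAIJCUSP are the CUSP-backed GPU classes in petsc-dev at this point, with -vec_type cusp and -mat_type aijcusp as the equivalent runtime options:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Vec            x;
  Mat            A;
  PetscInt       n = 100;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);
  ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
  ierr = VecSetSizes(x, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetType(x, VECCUSP);CHKERRQ(ierr);      /* GPU vector type */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJCUSP);CHKERRQ(ierr);   /* GPU matrix type */
  ierr = MatSetUp(A);CHKERRQ(ierr);
  /* fill A and x as usual; solver operations that have GPU implementations
     (MatMult, vector kernels, ...) then run on the device, with no CUDA
     directives or nvcc in the user code */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If you create your objects with VecSetFromOptions()/MatSetFromOptions() instead, you can switch types at runtime with -vec_type cusp -mat_type aijcusp and keep one source for CPU and GPU runs; your regular PETSc makefile should be enough, since nvcc is only needed when building PETSc itself.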
See here: http://www.mcs.anl.gov/petsc/features/gpus.html Thanks, Matt > Thanks in advance. > Christophe > > > > > > On Fri, Aug 29, 2014 at 6:29 PM, Jed Brown wrote: > >> Satish Balay writes: >> >> > On Fri, 29 Aug 2014, Dominic Meiser wrote: >> > >> >> On 08/29/2014 08:31 AM, Matthew Knepley wrote: >> >> > On Fri, Aug 29, 2014 at 9:35 AM, Dominic Meiser > >> > > wrote: >> > >> >> > > Dominic, I think that thrust.py should depend on cuda.py. Do >> you >> >> > > know why it does not? >> >> > In principle you are right, thrust.py should depend on cuda.py. >> >> > >> >> > However, in my opinion, thrust.py should go away as a separate >> >> > package altogether. Thrust is shipped as part of any recent >> >> > version of the cuda toolkit (I forget since which version, Paul >> >> > might know) and it's always installed in >> >> > $CUDA_TOOLKIT_ROOT/include/thrust. Thus we can automatically >> >> > deduct the thrust location from the cuda location. Thrust should >> >> > be considered part of cuda. >> > >> > I also think it should be removed. >> >> Agreed. If it was correct to consolidate umfpack, cholmod, etc., into >> suitesparse, then there is no question that thrust should be >> consolidated into cuda. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From christophe.ortiz at ciemat.es Mon Sep 1 06:15:01 2014 From: christophe.ortiz at ciemat.es (Christophe Ortiz) Date: Mon, 1 Sep 2014 13:15:01 +0200 Subject: [petsc-users] Unable to configure PETSc with CUDA: Problem with thrust directory In-Reply-To: References: <54008FCD.4030604@txcorp.com> <54009404.103@txcorp.com> <878um745tm.fsf@jedbrown.org> Message-ID: On Mon, Sep 1, 2014 at 12:56 PM, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 3:56 AM, Christophe Ortiz < > christophe.ortiz at ciemat.es> wrote: > >> Hi all, >> >> Thanks a lot for your answers. >> >> I followed Satish's advice and downloaded pets-dev. Then I tried to >> configure it with CUDA 6.0. It complained because I did not have flex >> installed, no big deal. Then it complained because it prefers GNU compilers >> than Intel ones, no big deal. >> >> This being fixed, I was able to configure and compile pets-dev with CUDA >> 6.0. There was no problem due to deprecated arch or thrust directory. >> >> I configured it with the following options: >> >> --with-x=1 --with-mpi=0 --with-cc=gcc --with-cxx=g++ --with-clanguage=cxx >> --with-fc=gfortran --with-cuda=1 --with-cuda-dir=/usr/local/cuda-6.0 >> --with-cuda-arch=sm_35 --with-thrust=1 >> --with-thrust-dir=/usr/local/cuda-6.0/include/thrust --with-cusp=1 >> --with-cusp-dir=/usr/local/cuda-6.0/include/cusp --with-debugging=1 >> --with-scalar-type=real --with-precision=double --download-fblaslapack >> >> >> Now this is done, how should I port my code to use it with CUDA ? Should >> I change something and include some CUDA directives in the code ? Is there >> some examples of makefile with nvcc that I could use ? >> > > You just change the types of objects to those that use the GPU. > See here: http://www.mcs.anl.gov/petsc/features/gpus.html > > Thanks. I will have a look at it. I also found the examples in the /src/. Will try to inspire from what is done. Christophe > Thanks, > > Matt > > >> Thanks in advance. >> Christophe >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From asmund.ervik at ntnu.no Mon Sep 1 06:18:13 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Mon, 01 Sep 2014 13:18:13 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> Message-ID: <540455F5.8050908@ntnu.no> On 01. sep. 2014 12:51, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 3:22 AM, ?smund Ervik wrote: > >> >> Subsequently I installed HDF5 v. 1.8.12 using my OS package manager and >> then tried to configure with "--with-hdf5 >> --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time both >> configure and make were successful, but "make test" fails with loads of >> undefined references to PETSc HDF5 stuff (see below). configure.log and >> make.log attached also for this case (error-with-hdf5.tar.gz) >> > > It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the log. > However, it > should have. We have been trying to understand why Make behaves in a bad way > after configuration failure. It should go away with make clean and another > make. Thanks Matt, make clean and then make did result in make test passing all tests. However, I am still unable to use HDF5 functionality. When I compile and run e.g. vec/vec/examples/tutorials/ex10.c it works fine with binary output, but with HDF5 I get the error message below. I'm guessing this is some incompatibility between the various MPI, HDF5 and PETSc versions, i.e. that the HDF5 from my OS is not using the same MPI as PETSc. Is it simple to edit something in PETSc and have "./configure --download-hdf5" get the most recent HDF5 library which compiles on my machine? Error message: $ ./ex10 -hdf5 Vec Object:Test_Vec 1 MPI processes type: seq 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 writing vector in hdf5 to vector.dat ... [0]PETSC ERROR: #1 PetscViewerFileSetName_HDF5() line 81 in /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/hdf5/hdf5v.c [0]PETSC ERROR: #2 PetscViewerFileSetName() line 624 in /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/ascii/filev.c [0]PETSC ERROR: #3 PetscViewerHDF5Open() line 163 in /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/hdf5/hdf5v.c [0]PETSC ERROR: #4 main() line 66 in /opt/petsc/optim_gfortran/src/vec/vec/examples/tutorials/ex10.c [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0 HDF5: infinite loop closing library D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD Regards, ?smund -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From asmund.ervik at ntnu.no Mon Sep 1 06:45:31 2014 From: asmund.ervik at ntnu.no (=?windows-1252?Q?=C5smund_Ervik?=) Date: Mon, 01 Sep 2014 13:45:31 +0200 Subject: [petsc-users] Reusing the preconditioner when using KSPSetComputeOperators In-Reply-To: <9A6AC8E7-682A-4648-AEEA-24CB01C4119A@mcs.anl.gov> References: <0E576811AB298343AC632BBCAAEFC37945D5AE5A@WAREHOUSE08.win.ntnu.no> <9A6AC8E7-682A-4648-AEEA-24CB01C4119A@mcs.anl.gov> Message-ID: <54045C5B.2030508@ntnu.no> On 29. aug. 2014 19:03, Barry Smith wrote: > > > On Aug 29, 2014, at 9:35 AM, ?smund Ervik wrote: > >>> On 28. aug. 2014 20:52, Barry Smith wrote: >>>> >>>> On Aug 28, 2014, at 4:34 AM, ?smund Ervik wrote: >>>> >>>>> Hello, >>>>> >>>>> I am solving a pressure Poisson equation with KSP, where the initial >>>>> guess, RHS and matrix are computed by functions that I've hooked into >>>>> KSPSetComputeXXX. (I'm also using DMDA for my domain decomposition.) >>>>> >>>>> For (single-phase|two-phase) I would like to (reuse|not reuse) the >>>>> preconditioner. How do I specify that when using this way of setting the >>>>> operator? Is it toggled by whether or not I call KSPSetOperators before >>>>> each KSPSolve? (The manual does not mention KSPSetComputeXXX.) >>>> >>>> You should call KSPSetOperators() before each KSPSolve() (otherwise the function you provide to compute the matrix won?t be triggered). >>>> >>>> With PETSc 3.5 after the call to KSPSetOperators() call KSPSetReusePreconditioner() to tell KSP wether to reuse the preconditioner or build a new one. >>>> With PETSc 3.4 and earlier, the final argument to KSPSetOperators() would be MAT_SAME_PRECONDITIONER to reuse the preconditioner or MAT_SAME_NONZERO_PATTERN to construct a new preconditioner >>>> >> >> Thanks Barry for the clarification. Is there an example somewhere that >> does this? All the ones I can find which use KSPSetComputeOperators() >> have no calls to KSPSetOperators(). I guess this is because they are >> only doing one linear solve? >> >> Furthermore, what should I pass in for Amat and Pmat to the KSPSetOperators() call? PetscNullObject, or do I get the Amat from the KSP somehow? > > You better call KSPGetOperators() to get them. Yes this is kind of silly. Okay, thanks. I found I also have to call MatSetFromOptions() on Amat and Pmat after I get them from KSPGetOperators(), otherwise KSPSetOperators() complains that "Object is in the wrong state! Mat object's type is not set!" This is indeed a strange sequence of calls. If this is the "standard" way of doing like ksp/ex34.c but for several successive solves, then it would be nice to have a comment/instruction in ex34.c for that case. ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Mon Sep 1 06:52:20 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 06:52:20 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <540455F5.8050908@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> Message-ID: On Mon, Sep 1, 2014 at 6:18 AM, ?smund Ervik wrote: > > > On 01. sep. 2014 12:51, Matthew Knepley wrote: > > On Mon, Sep 1, 2014 at 3:22 AM, ?smund Ervik > wrote: > > > >> > >> Subsequently I installed HDF5 v. 
1.8.12 using my OS package manager and > >> then tried to configure with "--with-hdf5 > >> --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time both > >> configure and make were successful, but "make test" fails with loads of > >> undefined references to PETSc HDF5 stuff (see below). configure.log and > >> make.log attached also for this case (error-with-hdf5.tar.gz) > >> > > > > It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the > log. > > However, it > > should have. We have been trying to understand why Make behaves in a bad > way > > after configuration failure. It should go away with make clean and > another > > make. > > Thanks Matt, make clean and then make did result in make test passing > all tests. > > However, I am still unable to use HDF5 functionality. When I compile and > run e.g. vec/vec/examples/tutorials/ex10.c it works fine with binary > output, but with HDF5 I get the error message below. > > I'm guessing this is some incompatibility between the various MPI, HDF5 > and PETSc versions, i.e. that the HDF5 from my OS is not using the same > MPI as PETSc. Is it simple to edit something in PETSc and have > "./configure --download-hdf5" get the most recent HDF5 library which > compiles on my machine? > Yes, I cannot reproduce, so it must be something like that. Can you reconfigure using the --download version? Thanks, Matt > Error message: > $ ./ex10 -hdf5 > Vec Object:Test_Vec 1 MPI processes > type: seq > 0 > 1 > 2 > 3 > 4 > 5 > 6 > 7 > 8 > 9 > 10 > 11 > 12 > 13 > 14 > 15 > 16 > 17 > 18 > 19 > writing vector in hdf5 to vector.dat ... > [0]PETSC ERROR: #1 PetscViewerFileSetName_HDF5() line 81 in > /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/hdf5/hdf5v.c > [0]PETSC ERROR: #2 PetscViewerFileSetName() line 624 in > /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/ascii/filev.c > [0]PETSC ERROR: #3 PetscViewerHDF5Open() line 163 in > /opt/petsc/optim_gfortran/src/sys/classes/viewer/impls/hdf5/hdf5v.c > [0]PETSC ERROR: #4 main() line 66 in > /opt/petsc/optim_gfortran/src/vec/vec/examples/tutorials/ex10.c > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0 > HDF5: infinite loop closing library > > > D,T,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD > > Regards, > ?smund > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From asmund.ervik at ntnu.no Mon Sep 1 06:57:23 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Mon, 01 Sep 2014 13:57:23 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> Message-ID: <54045F23.2070506@ntnu.no> On 01. sep. 2014 13:52, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 6:18 AM, ?smund Ervik wrote: > >> >> >> On 01. sep. 
2014 12:51, Matthew Knepley wrote: >>> On Mon, Sep 1, 2014 at 3:22 AM, ?smund Ervik >> wrote: >>> >>>> >>>> Subsequently I installed HDF5 v. 1.8.12 using my OS package manager and >>>> then tried to configure with "--with-hdf5 >>>> --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time both >>>> configure and make were successful, but "make test" fails with loads of >>>> undefined references to PETSc HDF5 stuff (see below). configure.log and >>>> make.log attached also for this case (error-with-hdf5.tar.gz) >>>> >>> >>> It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the >> log. >>> However, it >>> should have. We have been trying to understand why Make behaves in a bad >> way >>> after configuration failure. It should go away with make clean and >> another >>> make. >> >> Thanks Matt, make clean and then make did result in make test passing >> all tests. >> >> However, I am still unable to use HDF5 functionality. When I compile and >> run e.g. vec/vec/examples/tutorials/ex10.c it works fine with binary >> output, but with HDF5 I get the error message below. >> >> I'm guessing this is some incompatibility between the various MPI, HDF5 >> and PETSc versions, i.e. that the HDF5 from my OS is not using the same >> MPI as PETSc. Is it simple to edit something in PETSc and have >> "./configure --download-hdf5" get the most recent HDF5 library which >> compiles on my machine? >> > > Yes, I cannot reproduce, so it must be something like that. Can you > reconfigure > using the --download version? No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like I said in the first email, the version of HDF5 (1.8.10) that ships with PETSc 3.5.1 does not compile on my machine. This is apparently a known bug that was fixed upstream, cf. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 This is the reason why I asked whether I can somehow tell "--download-hdf5" to download a more recent version. 1.8.11 should do it. ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Mon Sep 1 07:02:08 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 07:02:08 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54045F23.2070506@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik wrote: > > On 01. sep. 2014 13:52, Matthew Knepley wrote: > > On Mon, Sep 1, 2014 at 6:18 AM, ?smund Ervik > wrote: > > > >> > >> > >> On 01. sep. 2014 12:51, Matthew Knepley wrote: > >>> On Mon, Sep 1, 2014 at 3:22 AM, ?smund Ervik > >> wrote: > >>> > >>>> > >>>> Subsequently I installed HDF5 v. 1.8.12 using my OS package manager > and > >>>> then tried to configure with "--with-hdf5 > >>>> --with-hdf5-dir=/path/to/system/hdf5" using PETSc 3.5.1. This time > both > >>>> configure and make were successful, but "make test" fails with loads > of > >>>> undefined references to PETSc HDF5 stuff (see below). configure.log > and > >>>> make.log attached also for this case (error-with-hdf5.tar.gz) > >>>> > >>> > >>> It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the > >> log. > >>> However, it > >>> should have. We have been trying to understand why Make behaves in a > bad > >> way > >>> after configuration failure. 
It should go away with make clean and > >> another > >>> make. > >> > >> Thanks Matt, make clean and then make did result in make test passing > >> all tests. > >> > >> However, I am still unable to use HDF5 functionality. When I compile and > >> run e.g. vec/vec/examples/tutorials/ex10.c it works fine with binary > >> output, but with HDF5 I get the error message below. > >> > >> I'm guessing this is some incompatibility between the various MPI, HDF5 > >> and PETSc versions, i.e. that the HDF5 from my OS is not using the same > >> MPI as PETSc. Is it simple to edit something in PETSc and have > >> "./configure --download-hdf5" get the most recent HDF5 library which > >> compiles on my machine? > >> > > > > Yes, I cannot reproduce, so it must be something like that. Can you > > reconfigure > > using the --download version? > > No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like > I said in the first email, the version of HDF5 (1.8.10) that ships with > PETSc 3.5.1 does not compile on my machine. This is apparently a known > bug that was fixed upstream, cf. > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 > > This is the reason why I asked whether I can somehow tell > "--download-hdf5" to download a more recent version. 1.8.11 should do it. People who leave commented out code in there should be tarred, feathered, and run out of town on a rail. You can just go in and delete that line, and then reconfigure. It will use the source that is already downloaded. Satish, can we patch our download source? I can do it, but I am not sure of the process for HDF5. Thanks, Matt > > ?smund > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Mon Sep 1 08:44:54 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Mon, 01 Sep 2014 15:44:54 +0200 Subject: [petsc-users] Set matrix column to vector In-Reply-To: References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Message-ID: Am 01.09.2014 12:45, schrieb Matthew Knepley: > On Mon, Sep 1, 2014 at 4:10 AM, Florian Lindner > wrote: > >> Hello, >> >> I want to set the entire column of a N x M matrix to a N vector. What >> is >> the best way to do that? >> >> My first guess would be to VecGetArray and use that array for >> MatSetValuesLocal with nrow = VecGetLocalSize. What is the best to say >> MatSetValuesLocal that I want to set all rows continuesly (same like >> passing irow = [0, 1, ..., VecGetLocalSize-1]? >> >> Any better way? >> > > You are assuming dense storage above, so you can use > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatDenseGetArray.html+ How can you tell that I'm assuming dense storage. My matrix is actually dense, but I try to write my code as generic as possible (being a very petsc newbie). 
I have that code which crashes at the moment: void set_column_vector(Vector v, int col) { PetscErrorCode ierr = 0; const PetscScalar *vec; PetscInt size, mat_rows, mat_cols; VecGetLocalSize(v.vector, &size); cout << "Vector Size = " << size << endl; MatGetSize(matrix, &mat_rows, &mat_cols); cout << "Matrix Rows = " << mat_rows << " Columns = " << mat_cols << endl; PetscInt irow[size]; for (int i = 0; i < size; i++) { irow[i] = i; } ierr = VecGetArrayRead(v.vector, &vec); CHKERRV(ierr); ierr = MatSetValuesLocal(matrix, size-1, irow, 1, &col, vec, INSERT_VALUES); CHKERRV(ierr); ierr = VecRestoreArrayRead(v.vector, &vec); CHKERRV(ierr); ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRV(ierr); ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRV(ierr); } v.vector is a Vec, matrix is a Mat. col = 1. It's compiled with mpic++, but started without, just ./a.out. Output is: Vector Size = 20 Matrix Rows = 20 Columns = 5 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatSetValuesLocal line 1950 /home/florian/software/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named asaru by florian Mon Sep 1 15:37:32 2014 [0]PETSC ERROR: Configure options --with-c2html=0 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- Any idea what the problem is? Thanks! Florian From mailinglists at xgm.de Mon Sep 1 08:48:27 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Mon, 01 Sep 2014 15:48:27 +0200 Subject: [petsc-users] Set matrix column to vector In-Reply-To: References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Message-ID: Am 01.09.2014 15:44, schrieb Florian Lindner: > Am 01.09.2014 12:45, schrieb Matthew Knepley: >> On Mon, Sep 1, 2014 at 4:10 AM, Florian Lindner >> wrote: >> >>> Hello, >>> >>> I want to set the entire column of a N x M matrix to a N vector. What >>> is >>> the best way to do that? >>> >>> My first guess would be to VecGetArray and use that array for >>> MatSetValuesLocal with nrow = VecGetLocalSize. 
What is the best to >>> say >>> MatSetValuesLocal that I want to set all rows continuesly (same like >>> passing irow = [0, 1, ..., VecGetLocalSize-1]? >>> >>> Any better way? >>> >> >> You are assuming dense storage above, so you can use >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatDenseGetArray.html+ > > How can you tell that I'm assuming dense storage. My matrix is > actually dense, but I try to write my code as generic as possible > (being a very petsc newbie). I have that code which crashes at the > moment: > > void set_column_vector(Vector v, int col) > { > PetscErrorCode ierr = 0; > const PetscScalar *vec; > PetscInt size, mat_rows, mat_cols; > VecGetLocalSize(v.vector, &size); > cout << "Vector Size = " << size << endl; > > MatGetSize(matrix, &mat_rows, &mat_cols); > cout << "Matrix Rows = " << mat_rows << " Columns = " << > mat_cols << endl; > PetscInt irow[size]; > for (int i = 0; i < size; i++) { > irow[i] = i; > } > > ierr = VecGetArrayRead(v.vector, &vec); CHKERRV(ierr); > ierr = MatSetValuesLocal(matrix, size-1, irow, 1, &col, vec, > INSERT_VALUES); CHKERRV(ierr); Correction: ierr = MatSetValuesLocal(matrix, size, irow, 1, &col, vec, INSERT_VALUES); size-1 was just one of my debugging experiments. > ierr = VecRestoreArrayRead(v.vector, &vec); CHKERRV(ierr); > ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); > CHKERRV(ierr); > ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRV(ierr); > } > > v.vector is a Vec, matrix is a Mat. col = 1. It's compiled with > mpic++, but started without, just ./a.out. Output is: > > Vector Size = 20 > Matrix Rows = 20 Columns = 5 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to > find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatSetValuesLocal line 1950 > /home/florian/software/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble > shooting. > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named asaru by > florian Mon Sep 1 15:37:32 2014 > [0]PETSC ERROR: Configure options --with-c2html=0 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > > > Any idea what the problem is? 
> > Thanks! > Florian From asmund.ervik at ntnu.no Mon Sep 1 09:02:22 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Mon, 01 Sep 2014 16:02:22 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: <54047C6E.2010406@ntnu.no> On 01. sep. 2014 14:02, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik wrote: > >> >> No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like >> I said in the first email, the version of HDF5 (1.8.10) that ships with >> PETSc 3.5.1 does not compile on my machine. This is apparently a known >> bug that was fixed upstream, cf. >> >> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 >> >> This is the reason why I asked whether I can somehow tell >> "--download-hdf5" to download a more recent version. 1.8.11 should do it. > > > People who leave commented out code in there should be tarred, feathered, > and run out of town on a rail. You can just go in and delete that line, and > then > reconfigure. It will use the source that is already downloaded. Okay, I also had to do the same for two lines in h5dump/h5dump_ddl.c:1344 and then I was able to configure with "--download-hdf5". I am now able to run vec/ex10.c using HDF5 with success! Thanks a lot :) (and double on the tar and feathers for people who use C++ comments in C code). After some tinkering I am now able to produce 3D plots of scalar fields from my code, and this looks good in Visit and in Tec360 Ex. However, it only works for scalars where only a single DOF is associated to a DMDA. Vector valued fields don't work either. But I am able to save two separate scalar fields to the same HDF5 file, as long as both are the single DOF associated to a DMDA. Also, if I call PetscViewerHDF5IncrementTimestep() the resulting HDF5 files can no longer be visualized even for the scalars, both Tec360 and Visit give error messages. Any hints on how to get full-fledged HDF5 files with timestep and vector/multiple scalar support? Thanks again, ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Mon Sep 1 09:11:42 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 09:11:42 -0500 Subject: [petsc-users] Set matrix column to vector In-Reply-To: References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Message-ID: On Mon, Sep 1, 2014 at 8:44 AM, Florian Lindner wrote: > Am 01.09.2014 12:45, schrieb Matthew Knepley: > >> On Mon, Sep 1, 2014 at 4:10 AM, Florian Lindner >> wrote: >> >> Hello, >>> >>> I want to set the entire column of a N x M matrix to a N vector. What is >>> the best way to do that? >>> >>> My first guess would be to VecGetArray and use that array for >>> MatSetValuesLocal with nrow = VecGetLocalSize. What is the best to say >>> MatSetValuesLocal that I want to set all rows continuesly (same like >>> passing irow = [0, 1, ..., VecGetLocalSize-1]? >>> >>> Any better way? >>> >>> >> You are assuming dense storage above, so you can use >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ >> MatDenseGetArray.html+ >> > > How can you tell that I'm assuming dense storage. My matrix is actually > dense, but I try to write my code as generic as possible (being a very > petsc newbie). 
I have that code which crashes at the moment: > I recommend running using the debugger so you can get a stack trace, and perhaps see exactly what the problem is. You can also run under valgrind as the error says. Matt > void set_column_vector(Vector v, int col) > { > PetscErrorCode ierr = 0; > const PetscScalar *vec; > PetscInt size, mat_rows, mat_cols; > VecGetLocalSize(v.vector, &size); > cout << "Vector Size = " << size << endl; > > MatGetSize(matrix, &mat_rows, &mat_cols); > cout << "Matrix Rows = " << mat_rows << " Columns = " << mat_cols > << endl; > PetscInt irow[size]; > for (int i = 0; i < size; i++) { > irow[i] = i; > } > > ierr = VecGetArrayRead(v.vector, &vec); CHKERRV(ierr); > ierr = MatSetValuesLocal(matrix, size-1, irow, 1, &col, vec, > INSERT_VALUES); CHKERRV(ierr); > ierr = VecRestoreArrayRead(v.vector, &vec); CHKERRV(ierr); > ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRV(ierr); > ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRV(ierr); > } > > v.vector is a Vec, matrix is a Mat. col = 1. It's compiled with mpic++, > but started without, just ./a.out. Output is: > > Vector Size = 20 > Matrix Rows = 20 Columns = 5 > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org > on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatSetValuesLocal line 1950 > /home/florian/software/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown > [0]PETSC ERROR: ./a.out on a arch-linux2-c-debug named asaru by florian > Mon Sep 1 15:37:32 2014 > [0]PETSC ERROR: Configure options --with-c2html=0 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > > > Any idea what the problem is? > > Thanks! > Florian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 1 09:15:47 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Sep 2014 09:15:47 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54047C6E.2010406@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <54047C6E.2010406@ntnu.no> Message-ID: On Mon, Sep 1, 2014 at 9:02 AM, ?smund Ervik wrote: > On 01. sep. 2014 14:02, Matthew Knepley wrote: > > On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik > wrote: > > > >> > >> No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like > >> I said in the first email, the version of HDF5 (1.8.10) that ships with > >> PETSc 3.5.1 does not compile on my machine. This is apparently a known > >> bug that was fixed upstream, cf. > >> > >> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 > >> > >> This is the reason why I asked whether I can somehow tell > >> "--download-hdf5" to download a more recent version. 1.8.11 should do > it. > > > > > > People who leave commented out code in there should be tarred, feathered, > > and run out of town on a rail. You can just go in and delete that line, > and > > then > > reconfigure. It will use the source that is already downloaded. > > Okay, I also had to do the same for two lines in > h5dump/h5dump_ddl.c:1344 and then I was able to configure with > "--download-hdf5". I am now able to run vec/ex10.c using HDF5 with success! > > Thanks a lot :) (and double on the tar and feathers for people who use > C++ comments in C code). > > After some tinkering I am now able to produce 3D plots of scalar fields > from my code, and this looks good in Visit and in Tec360 Ex. However, it > only works for scalars where only a single DOF is associated to a DMDA. > Vector valued fields don't work either. But I am able to save two > separate scalar fields to the same HDF5 file, as long as both are the > single DOF associated to a DMDA. > It seems like these programs expect a different organization than we use for multiple DOF. Do you know what they want? > Also, if I call PetscViewerHDF5IncrementTimestep() the resulting HDF5 > files can no longer be visualized even for the scalars, both Tec360 and > Visit give error messages. > HDF5 is a generic format, and programs like Visit and Tec360 assume some organization inside. For time we just add an array dimension, but they might do something else. I recommend looking at bin/pythonscripts/petsc_gen_xdmf.py, which creates .xdmf to describe our HDF5 organization. Thanks, Matt Any hints on how to get full-fledged HDF5 files with timestep and > vector/multiple scalar support? > > Thanks again, > ?smund > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
???????? ???? (812) 981-72-40 (952) 368-65-21 info at el-sites.ru www.el-sites.ru Skype: el.sites ???? ?? ?????? ?????????? ?? ????????, ??????? ????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Tue Sep 2 02:08:58 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 2 Sep 2014 07:08:58 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: , , Message-ID: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> Matt, Attached is a small Fortran code that replicates the second problem. Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. Chris ________________________________ From: Matthew Knepley Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fieldsplitbug.F90 Type: text/x-fortran Size: 4419 bytes Desc: fieldsplitbug.F90 URL: From asmund.ervik at ntnu.no Tue Sep 2 03:09:07 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Tue, 02 Sep 2014 10:09:07 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <54047C6E.2010406@ntnu.no> Message-ID: <54057B23.40500@ntnu.no> On 01. sep. 2014 16:15, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 9:02 AM, ?smund Ervik wrote: > >> On 01. sep. 2014 14:02, Matthew Knepley wrote: >>> On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik >> wrote: >>> >>>> >>>> No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like >>>> I said in the first email, the version of HDF5 (1.8.10) that ships with >>>> PETSc 3.5.1 does not compile on my machine. This is apparently a known >>>> bug that was fixed upstream, cf. >>>> >>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 >>>> >>>> This is the reason why I asked whether I can somehow tell >>>> "--download-hdf5" to download a more recent version. 1.8.11 should do >> it. >>> >>> >>> People who leave commented out code in there should be tarred, feathered, >>> and run out of town on a rail. You can just go in and delete that line, >> and >>> then >>> reconfigure. It will use the source that is already downloaded. >> >> Okay, I also had to do the same for two lines in >> h5dump/h5dump_ddl.c:1344 and then I was able to configure with >> "--download-hdf5". I am now able to run vec/ex10.c using HDF5 with success! >> >> Thanks a lot :) (and double on the tar and feathers for people who use >> C++ comments in C code). >> >> After some tinkering I am now able to produce 3D plots of scalar fields >> from my code, and this looks good in Visit and in Tec360 Ex. However, it >> only works for scalars where only a single DOF is associated to a DMDA. >> Vector valued fields don't work either. But I am able to save two >> separate scalar fields to the same HDF5 file, as long as both are the >> single DOF associated to a DMDA. >> > > It seems like these programs expect a different organization than we use for > multiple DOF. Do you know what they want? The Tec360 Manual says the following: "The HDF5 loader add-on allows you to import general HDF5 files into Tecplot 360. The loader provides a mechanism for importing generic data from multiple HDF5 datasets or groups. 
The HDF5 loader will load datasets within user selected groups, load one or more user selected datasets to one zone, load multiple user selected datasets to multiple zones, execute macros after data has been loaded, create implicit X, Y, and Z grid vectors as needed, sub-sample loaded data, and reference user selected vectors for X, Y, and Z grids. Datasets must be ordered data. The HDF5 library used is version 1.8.5." This did not leave me any wiser, but perhaps it means something to you. FYI, the error I get when trying to import vectors or multiple scalars is "rank N data not supported" where N >= 4. > > >> Also, if I call PetscViewerHDF5IncrementTimestep() the resulting HDF5 >> files can no longer be visualized even for the scalars, both Tec360 and >> Visit give error messages. >> > > HDF5 is a generic format, and programs like Visit and Tec360 assume some > organization inside. For time we just add an array dimension, but they > might do > something else. I recommend looking at bin/pythonscripts/petsc_ge_xdmf.py, > which creates .xdmf to describe our HDF5 organization. Xdmf seems handy, and I would like to use this feature, but I'm not really sure how to. I tried running "petsc_ge_xdmf.py Myfile.h5" but I get the error below. Am I using the script wrong? Or perhaps it is assuming that the data is from a DMplex? I'm using DMDA. Traceback (most recent call last): File "./petsc_gen_xdmf.py", line 220, in generateXdmf(sys.argv[1]) File "./petsc_gen_xdmf.py", line 194, in generateXdmf geom = h5['geometry'] File "/usr/lib/python2.7/site-packages/h5py/_hl/group.py", line 153, in __getitem__ oid = h5o.open(self.id, self._e(name), lapl=self._lapl) File "h5o.pyx", line 183, in h5py.h5o.open (h5py/h5o.c:3844) KeyError: "Unable to open object (Object 'geometry' doesn't exist)" ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From mailinglists at xgm.de Tue Sep 2 03:41:34 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 02 Sep 2014 10:41:34 +0200 Subject: [petsc-users] Set matrix column to vector In-Reply-To: References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> Message-ID: <70e1366f854f58b79c74299993fb3fd3@xgm.de> Am 01.09.2014 16:11, schrieb Matthew Knepley: > I recommend running using the debugger so you can get a stack trace, > and > perhaps see > exactly what the problem is. You can also run under valgrind as the > error > says. This of course I tried, but had no real success. It crashes in ISLocalToGlobalMappingApply, on the first line: PetscInt i,bs = mapping->bs,Nmax = bs*mapping->n; It crashes only when using MatSetValuesLocal, not when using MatSetValues. I've created an example that compiles just fine: http://pastebin.com/iEkLK9DZ Sorry, I really got no idea what could be the problem her. Thx, Florian #include #include "petscmat.h" #include "petscviewer.h" // Compiling with: mpic++ -g3 -Wall -I ~/software/petsc/include -I ~/software/petsc/arch-linux2-c-debug/include -L ~/software/petsc/arch-linux2-c-debug/lib -lpetsc test.cpp // Running without mpirun. int main(int argc, char **args) { PetscInitialize(&argc, &args, "", NULL); PetscErrorCode ierr = 0; int num_rows = 10; Mat matrix; Vec vector; // Create dense matrix, but code should work for sparse, too (I hope). 
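// NOTE (see Lawrence Mitchell's reply further down): the segfault occurs because
// no local-to-global mapping is ever attached to 'matrix', so the mapping is NULL
// inside ISLocalToGlobalMappingApply when MatSetValuesLocal is reached. A sketch
// of the missing calls, to be placed after MatCreateDense below -- argument lists
// are indicative only, and the block-size argument of ISLocalToGlobalMappingCreate
// is specific to petsc-3.5:
//
//   ISLocalToGlobalMapping rmap, cmap;
//   PetscInt gr[num_rows], gc[4];
//   for (int i = 0; i < num_rows; i++) gr[i] = i;   // identity row numbering
//   for (int i = 0; i < 4; i++) gc[i] = i;          // identity column numbering
//   ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, num_rows, gr, PETSC_COPY_VALUES, &rmap);
//   ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, 4, gc, PETSC_COPY_VALUES, &cmap);
//   MatSetLocalToGlobalMapping(matrix, rmap, cmap);
//
// With a mapping set, the MatSetValuesLocal call further down translates local
// indices to global ones instead of dereferencing a NULL mapping.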
ierr = MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, num_rows, 4, NULL, &matrix); CHKERRQ(ierr); ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = VecCreate(PETSC_COMM_WORLD, &vector); CHKERRQ(ierr); ierr = VecSetSizes(vector, PETSC_DECIDE, num_rows); CHKERRQ(ierr); ierr = VecSetFromOptions(vector); CHKERRQ(ierr); // Init vector with 0, 1, ... , num_rows-1 PetscScalar *a; PetscInt range_start, range_end, pos = 0; VecGetOwnershipRange(vector, &range_start, &range_end); ierr = VecGetArray(vector, &a); CHKERRQ(ierr); for (PetscInt i = range_start; i < range_end; i++) { a[pos] = pos + range_start; pos++; } VecRestoreArray(vector, &a); // VecAssemblyBegin(vector); VecAssemblyEnd(vector); I don't think it's needed here, changes nothing. ierr = VecView(vector, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); PetscInt irow[num_rows]; const PetscInt col = 2; const PetscScalar *vec; for (int i = 0; i < num_rows; i++) { irow[i] = i; } ierr = VecGetArrayRead(vector, &vec); CHKERRQ(ierr); // MatSetValuesLocal(Mat mat,PetscInt nrow,const PetscInt irow[],PetscInt ncol,const PetscInt icol[],const PetscScalar y[],InsertMode addv) ierr = MatSetValuesLocal(matrix, num_rows, irow, 1, &col, vec, INSERT_VALUES); CHKERRQ(ierr); // Works fine with MatSetValues ierr = VecRestoreArrayRead(vector, &vec); CHKERRQ(ierr); ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = MatView(matrix, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); PetscFinalize(); return 0; } From lawrence.mitchell at imperial.ac.uk Tue Sep 2 03:55:44 2014 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Tue, 02 Sep 2014 09:55:44 +0100 Subject: [petsc-users] Set matrix column to vector In-Reply-To: <70e1366f854f58b79c74299993fb3fd3@xgm.de> References: <01ba7f2169a09f581b3a0f127e12f2e8@xgm.de> <70e1366f854f58b79c74299993fb3fd3@xgm.de> Message-ID: <54058610.9000203@imperial.ac.uk> On 02/09/14 09:41, Florian Lindner wrote: > Am 01.09.2014 16:11, schrieb Matthew Knepley: > >> I recommend running using the debugger so you can get a stack trace, and >> perhaps see >> exactly what the problem is. You can also run under valgrind as the error >> says. > > This of course I tried, but had no real success. It crashes in > ISLocalToGlobalMappingApply, on the first line: PetscInt i,bs = > mapping->bs,Nmax = bs*mapping->n; > > It crashes only when using MatSetValuesLocal, not when using MatSetValues. > > I've created an example that compiles just fine: > http://pastebin.com/iEkLK9DZ You never create a local to global mapping and set it on the matrix, so that mapping is NULL inside ISLocalToGlobalMappingApply. You need to do something like: ISLocalToGlobalMappingCreate(..., &rmapping); ISLocalToGlobalMappingCreate(..., &cmapping); MatSetLocalToGlobalMapping(mat, rmapping, cmapping); Then you can use MatSetValuesLocal. Cheers, Lawrence From asmund.ervik at ntnu.no Tue Sep 2 05:13:29 2014 From: asmund.ervik at ntnu.no (=?UTF-8?B?w4VzbXVuZCBFcnZpaw==?=) Date: Tue, 02 Sep 2014 12:13:29 +0200 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54057B23.40500@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <54047C6E.2010406@ntnu.no> <54057B23.40500@ntnu.no> Message-ID: <54059849.6040709@ntnu.no> On 02. sep. 2014 10:09, ?smund Ervik wrote: > > > On 01. sep. 
2014 16:15, Matthew Knepley wrote: >> On Mon, Sep 1, 2014 at 9:02 AM, ?smund Ervik wrote: >> >>> On 01. sep. 2014 14:02, Matthew Knepley wrote: >>>> On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik >>> wrote: >>>> >>>>> >>>>> No, I am not able to, that is why I switched to "--with-hdf5-dir=". Like >>>>> I said in the first email, the version of HDF5 (1.8.10) that ships with >>>>> PETSc 3.5.1 does not compile on my machine. This is apparently a known >>>>> bug that was fixed upstream, cf. >>>>> >>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 >>>>> >>>>> This is the reason why I asked whether I can somehow tell >>>>> "--download-hdf5" to download a more recent version. 1.8.11 should do >>> it. >>>> >>>> >>>> People who leave commented out code in there should be tarred, feathered, >>>> and run out of town on a rail. You can just go in and delete that line, >>> and >>>> then >>>> reconfigure. It will use the source that is already downloaded. >>> >>> Okay, I also had to do the same for two lines in >>> h5dump/h5dump_ddl.c:1344 and then I was able to configure with >>> "--download-hdf5". I am now able to run vec/ex10.c using HDF5 with success! >>> >>> Thanks a lot :) (and double on the tar and feathers for people who use >>> C++ comments in C code). >>> >>> After some tinkering I am now able to produce 3D plots of scalar fields >>> from my code, and this looks good in Visit and in Tec360 Ex. However, it >>> only works for scalars where only a single DOF is associated to a DMDA. >>> Vector valued fields don't work either. But I am able to save two >>> separate scalar fields to the same HDF5 file, as long as both are the >>> single DOF associated to a DMDA. >>> >> >> It seems like these programs expect a different organization than we use for >> multiple DOF. Do you know what they want? > > The Tec360 Manual says the following: > "The HDF5 loader add-on allows you to import general HDF5 files into > Tecplot 360. The loader provides a mechanism for importing generic data > from multiple HDF5 datasets or groups. The HDF5 loader will load > datasets within user selected groups, load one or more user selected > datasets to one zone, load multiple user selected datasets to multiple > zones, execute macros after data has been loaded, create implicit X, Y, > and Z grid vectors as needed, sub-sample loaded data, and reference user > selected vectors for X, Y, and Z grids. Datasets must be ordered data. > The HDF5 library used is version 1.8.5." > > This did not leave me any wiser, but perhaps it means something to you. > FYI, the error I get when trying to import vectors or multiple scalars > is "rank N data not supported" where N >= 4. > >> >> >>> Also, if I call PetscViewerHDF5IncrementTimestep() the resulting HDF5 >>> files can no longer be visualized even for the scalars, both Tec360 and >>> Visit give error messages. >>> >> >> HDF5 is a generic format, and programs like Visit and Tec360 assume some >> organization inside. For time we just add an array dimension, but they >> might do >> something else. I recommend looking at bin/pythonscripts/petsc_ge_xdmf.py, >> which creates .xdmf to describe our HDF5 organization. > > Xdmf seems handy, and I would like to use this feature, but I'm not > really sure how to. I tried running "petsc_ge_xdmf.py Myfile.h5" but I > get the error below. Am I using the script wrong? Or perhaps it is > assuming that the data is from a DMplex? I'm using DMDA. 
> > Traceback (most recent call last): > File "./petsc_gen_xdmf.py", line 220, in > generateXdmf(sys.argv[1]) > File "./petsc_gen_xdmf.py", line 194, in generateXdmf > geom = h5['geometry'] > File "/usr/lib/python2.7/site-packages/h5py/_hl/group.py", line 153, > in __getitem__ > oid = h5o.open(self.id, self._e(name), lapl=self._lapl) > File "h5o.pyx", line 183, in h5py.h5o.open (h5py/h5o.c:3844) > KeyError: "Unable to open object (Object 'geometry' doesn't exist)" Update: by using VecStrideGather to split the global vec from a multicomponent DMDA into dof number of global vec's, and then using VecView on these, I am able to visualize everything I need. But this feels like a hacky solution, and I see that VecStrideGather is not optimized for speed, so if you have a good alternative (e.g. using Xdmf correctly) I'm all ears. ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From knepley at gmail.com Tue Sep 2 05:16:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Sep 2014 05:16:06 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54057B23.40500@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <54047C6E.2010406@ntnu.no> <54057B23.40500@ntnu.no> Message-ID: On Tue, Sep 2, 2014 at 3:09 AM, ?smund Ervik wrote: > > > On 01. sep. 2014 16:15, Matthew Knepley wrote: > > On Mon, Sep 1, 2014 at 9:02 AM, ?smund Ervik > wrote: > > > >> On 01. sep. 2014 14:02, Matthew Knepley wrote: > >>> On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik > >> wrote: > >>> > >>>> > >>>> No, I am not able to, that is why I switched to "--with-hdf5-dir=". > Like > >>>> I said in the first email, the version of HDF5 (1.8.10) that ships > with > >>>> PETSc 3.5.1 does not compile on my machine. This is apparently a known > >>>> bug that was fixed upstream, cf. > >>>> > >>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=711777 > >>>> > >>>> This is the reason why I asked whether I can somehow tell > >>>> "--download-hdf5" to download a more recent version. 1.8.11 should do > >> it. > >>> > >>> > >>> People who leave commented out code in there should be tarred, > feathered, > >>> and run out of town on a rail. You can just go in and delete that line, > >> and > >>> then > >>> reconfigure. It will use the source that is already downloaded. > >> > >> Okay, I also had to do the same for two lines in > >> h5dump/h5dump_ddl.c:1344 and then I was able to configure with > >> "--download-hdf5". I am now able to run vec/ex10.c using HDF5 with > success! > >> > >> Thanks a lot :) (and double on the tar and feathers for people who use > >> C++ comments in C code). > >> > >> After some tinkering I am now able to produce 3D plots of scalar fields > >> from my code, and this looks good in Visit and in Tec360 Ex. However, it > >> only works for scalars where only a single DOF is associated to a DMDA. > >> Vector valued fields don't work either. But I am able to save two > >> separate scalar fields to the same HDF5 file, as long as both are the > >> single DOF associated to a DMDA. > >> > > > > It seems like these programs expect a different organization than we use > for > > multiple DOF. Do you know what they want? > > The Tec360 Manual says the following: > "The HDF5 loader add-on allows you to import general HDF5 files into > Tecplot 360. 
The loader provides a mechanism for importing generic data > from multiple HDF5 datasets or groups. The HDF5 loader will load > datasets within user selected groups, load one or more user selected > datasets to one zone, load multiple user selected datasets to multiple > zones, execute macros after data has been loaded, create implicit X, Y, > and Z grid vectors as needed, sub-sample loaded data, and reference user > selected vectors for X, Y, and Z grids. Datasets must be ordered data. > The HDF5 library used is version 1.8.5." > > This did not leave me any wiser, but perhaps it means something to you. > FYI, the error I get when trying to import vectors or multiple scalars > is "rank N data not supported" where N >= 4. > We add the DOF as another dimension, but clearly in 3D this means a 4D structure, which Tec360 does not support, so they are looking for exactly what you did with VecStrideScatter. That sounds like bad design on their part to me. > > > > > >> Also, if I call PetscViewerHDF5IncrementTimestep() the resulting HDF5 > >> files can no longer be visualized even for the scalars, both Tec360 and > >> Visit give error messages. > >> > > > > HDF5 is a generic format, and programs like Visit and Tec360 assume some > > organization inside. For time we just add an array dimension, but they > > might do > > something else. I recommend looking at > bin/pythonscripts/petsc_ge_xdmf.py, > > which creates .xdmf to describe our HDF5 organization. > > Xdmf seems handy, and I would like to use this feature, but I'm not > really sure how to. I tried running "petsc_ge_xdmf.py Myfile.h5" but I > get the error below. Am I using the script wrong? Or perhaps it is > assuming that the data is from a DMplex? I'm using DMDA. > > Traceback (most recent call last): > File "./petsc_gen_xdmf.py", line 220, in > generateXdmf(sys.argv[1]) > File "./petsc_gen_xdmf.py", line 194, in generateXdmf > geom = h5['geometry'] > File "/usr/lib/python2.7/site-packages/h5py/_hl/group.py", line 153, > in __getitem__ > oid = h5o.open(self.id, self._e(name), lapl=self._lapl) > File "h5o.pyx", line 183, in h5py.h5o.open (h5py/h5o.c:3844) > KeyError: "Unable to open object (Object 'geometry' doesn't exist)" Yes, this is designed for unstructured grids. I just meant you could look at it and modify. It pretty short. Thanks, Matt > > ?smund > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Sep 2 08:24:42 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 02 Sep 2014 07:24:42 -0600 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> Message-ID: <878um2b1et.fsf@jedbrown.org> Matthew Knepley writes: > It did not build src/sys/classes/viewer/impls/hdf5/, clearly from the log. > However, it > should have. We have been trying to understand why Make behaves in a bad way > after configuration failure. It should go away with make clean and another > make. > > Jed, here is another case of this make bug. Need to check PETSC_ARCH/conf/files to find out when the error occurs. I suspect it is something to do with how conf/files is generated because make has no distinction at all between optional and mandatory after conf/files has been written. 
But I don't know what would be happening in the Python because the logic in there looks mighty deterministic to me. OTOH, make.log does not show the line where that file should be generated, so perhaps the first build was using an old version of the file? I pushed this to 'maint'. We need to take drastic actions to fix 'master', more on that coming to petsc-dev. commit fce4001c127676bb645ba81e8336311793ef2275 Author: Jed Brown Date: Tue Sep 2 07:10:49 2014 -0600 make: PETSC_ARCH/conf/files depends on PETSC_ARCH/include/petscconf.h When a PETSC_ARCH is reconfigured with an additional external package, petscconf.h is updated and we expect conf/files to be regenerated, but there was no explicit dependency causing that to happen. I think this is the cause of the non-reproducible build failures that have been reported. This commit adds the explicit dependency. Reported-by: Kai Song Reported-by: ?smund Ervik diff --git a/gmakefile b/gmakefile index 37a7d09..03ee451 100644 --- a/gmakefile +++ b/gmakefile @@ -73,7 +73,7 @@ else # Show the full command line quiet = $($1) endif -$(PETSC_ARCH)/conf/files : +$(PETSC_ARCH)/conf/files : $(PETSC_ARCH)/include/petscconf.h $(PYTHON) conf/gmakegen.py --petsc-arch=$(PETSC_ARCH) -include $(generated) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Tue Sep 2 10:42:36 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Sep 2014 10:42:36 -0500 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> Message-ID: On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan wrote: > Matt, > > Attached is a small Fortran code that replicates the second problem. > This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt > Chris > > dr. ir. Christiaan Klaij > > CFD Researcher > Research & Development > > > > *MARIN* > > > 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 > AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I > www.marin.nl > > > > MARIN news: MARIN at SMM, Hamburg, September 9-12 > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > > > ------------------------------ > *From:* Klaij, Christiaan > *Sent:* Friday, August 29, 2014 4:42 PM > *To:* Matthew Knepley > *Cc:* petsc-users at mcs.anl.gov > *Subject:* RE: [petsc-users] PCFieldSplitSetSchurPre in fortran > > Matt, > > The small test code (ex70) is in C and it works fine, the problem > happens in a big Fortran code. I will try to replicate the > problem in a small Fortran code, but that will take some time. > > Chris > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Friday, August 29, 2014 4:14 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran > > On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: > >> I'm trying PCFieldSplitSetSchurPre with >> PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. 
>> >> The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to >> be missing in fortran, I get the compile error: >> >> This name does not have a type, and must have an explicit type. >> [PC_FIELDSPLIT_SCHUR_PRE_SELFP] >> >> while compilation works fine with _A11, _SELF and _USER. >> > > Mark Adams has just fixed this. > > >> The second problem is that the call doesn't seem to have any >> effect. For example, I have >> >> CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) >> CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) >> >> This compiles and runs, but ksp_view tells me >> >> PC Object:(sys_) 3 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization LOWER >> Preconditioner for the Schur complement formed from A11 >> >> So changing the factorization from the default FULL to LOWER did >> work, but changing the preconditioner from A11 to USER didn't. >> >> I've also tried to run directly from the command line using >> >> -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view >> >> This works in the sense that I don't get the "WARNING! There are >> options you set that were not used!" message, but still ksp_view >> reports A11 instead of user provided matrix. >> > > Can you send a small test code, since I use this everyday here and it > works. > > Thanks, > > Matt > > >> Chris >> >> >> dr. ir. Christiaan Klaij >> CFD Researcher >> Research & Development >> E mailto:C.Klaij at marin.nl >> T +31 317 49 33 44 >> >> >> MARIN >> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From alpkalpalp at gmail.com Wed Sep 3 03:10:35 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Wed, 3 Sep 2014 11:10:35 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence Message-ID: Hi, I need to investigate the performance of FETI-DP on a heteregenous problem with a proposed preconditioner/scaling and different parameters. I do not want to reinvent the known. I need to investigate the outcomes of my research on top of FETI-DP. I have seen several papers in literature discuss about the FETI-DP implementations in PetSc. However, seems none of them is provided in public. I have come across several functions in PetSc with *FETIDP* names. But, since this algorithm is quite different then the ksp algorithms, I could not figure out how to use them. I have two concrete questions; 1- Is it possible to complete a FETI-DP solution with the provided functions in current PetSc release? 2- If so, what is the calling sequence for these functions? Thanks in advance, Alp -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ziad.tllas2014 at syrtel.biz Wed Sep 3 04:19:07 2014 From: ziad.tllas2014 at syrtel.biz (Zaid Tlas) Date: Wed, 3 Sep 2014 09:19:07 +0000 Subject: [petsc-users] Proposal Message-ID: <3260132117344110385453@cloud-server-73> Good day. I have attached a business proposal that I believe will be of mutual benefit to both of us to this email. Please go through the attached proposal and let me know if you are interested in working with us on this project. After reading the proposal, you can email me back on: ziad.tlas2014 at syrtel.biz so that we can go over the details together. If for some reasons you are unable to view the attachment, kindly inform me so that I can resend the proposal in a different format or plain text. Regards Ziad. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Proposal.pdf Type: application/pdf Size: 478132 bytes Desc: not available URL: From knepley at gmail.com Wed Sep 3 06:09:03 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 06:09:03 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: Message-ID: On Wed, Sep 3, 2014 at 3:10 AM, Alp Kalpalp wrote: > Hi, > > I need to investigate the performance of FETI-DP on a heteregenous problem > with a proposed preconditioner/scaling and different parameters. I do not > want to reinvent the known. I need to investigate the outcomes of my > research on top of FETI-DP. > > I have seen several papers in literature discuss about the FETI-DP > implementations in PetSc. However, seems none of them is provided in > public. I have come across several functions in PetSc with *FETIDP* names. > But, since this algorithm is quite different then the ksp algorithms, I > could not figure out how to use them. > > I have two concrete questions; > > 1- Is it possible to complete a FETI-DP solution with the provided > functions in current PetSc release? > There is no FETI-DP in PETSc. However, I recommend mailing Oliver Rheinbach to see about collaboration (http://www.uni-koeln.de/~orheinba/) Thanks, Matt > 2- If so, what is the calling sequence for these functions? > > Thanks in advance, > > Alp > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko.karin at gmail.com Wed Sep 3 06:23:27 2014 From: niko.karin at gmail.com (Karin&NiKo) Date: Wed, 3 Sep 2014 13:23:27 +0200 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: Message-ID: You could perhaps have a look at the PCBDDC : http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCBDDC.html contributed by Stefano Zampini. Best, Nicolas 2014-09-03 13:09 GMT+02:00 Matthew Knepley : > On Wed, Sep 3, 2014 at 3:10 AM, Alp Kalpalp wrote: > >> Hi, >> >> I need to investigate the performance of FETI-DP on a heteregenous >> problem with a proposed preconditioner/scaling and different parameters. I >> do not want to reinvent the known. I need to investigate the outcomes of my >> research on top of FETI-DP. >> >> I have seen several papers in literature discuss about the FETI-DP >> implementations in PetSc. However, seems none of them is provided in >> public. I have come across several functions in PetSc with *FETIDP* names. 
>> But, since this algorithm is quite different then the ksp algorithms, I >> could not figure out how to use them. >> >> I have two concrete questions; >> >> 1- Is it possible to complete a FETI-DP solution with the provided >> functions in current PetSc release? >> > > There is no FETI-DP in PETSc. However, I recommend mailing Oliver > Rheinbach to see about collaboration (http://www.uni-koeln.de/~orheinba/) > > Thanks, > > Matt > > >> 2- If so, what is the calling sequence for these functions? >> >> Thanks in advance, >> >> Alp >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Wed Sep 3 07:00:03 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 3 Sep 2014 12:00:03 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local>, Message-ID: Matt, Thanks for the fix. If I understand correctly, in an existing install of petsc-3.5.1, I would only need to replace the file "finclude/petscpc.h" by the new file for the fix to work? (instead of downloading dev, configuring, installing on various machines). Chris MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley Sent: Tuesday, September 02, 2014 5:42 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: Matt, Attached is a small Fortran code that replicates the second problem. This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. Chris ________________________________ From: Matthew Knepley > Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. 
The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: From knepley at gmail.com Wed Sep 3 07:12:00 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 07:12:00 -0500 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> Message-ID: On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan wrote: > Matt, > > Thanks for the fix. If I understand correctly, in an existing > install of petsc-3.5.1, I would only need to replace the > file "finclude/petscpc.h" by the new file for the fix to > work? (instead of downloading dev, configuring, installing on > various machines). > Yes Matt > Chris > > > > MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. 
> > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Tuesday, September 02, 2014 5:42 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran > > On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: > >> Matt, >> >> Attached is a small Fortran code that replicates the second problem. >> > > This was a Fortran define problem. I fixed it on next > > > https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran > > and it will be in maint and master tomorrow. > > Thanks, > > Matt > > >> Chris >> >> dr. ir. Christiaan Klaij >> >> CFD Researcher >> Research & Development >> >> >> >> *MARIN* >> >> >> 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 >> AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I >> www.marin.nl >> >> >> >> MARIN news: MARIN at SMM, Hamburg, September 9-12 >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return it to >> the sender immediately and delete your copy from your system. >> >> >> >> ------------------------------ >> *From:* Klaij, Christiaan >> *Sent:* Friday, August 29, 2014 4:42 PM >> *To:* Matthew Knepley >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* RE: [petsc-users] PCFieldSplitSetSchurPre in fortran >> >> Matt, >> >> The small test code (ex70) is in C and it works fine, the problem >> happens in a big Fortran code. I will try to replicate the >> problem in a small Fortran code, but that will take some time. >> >> Chris >> >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Friday, August 29, 2014 4:14 PM >> *To:* Klaij, Christiaan >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >> >> On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan >> wrote: >> >>> I'm trying PCFieldSplitSetSchurPre with >>> PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. >>> >>> The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to >>> be missing in fortran, I get the compile error: >>> >>> This name does not have a type, and must have an explicit type. >>> [PC_FIELDSPLIT_SCHUR_PRE_SELFP] >>> >>> while compilation works fine with _A11, _SELF and _USER. >>> >> >> Mark Adams has just fixed this. >> >> >>> The second problem is that the call doesn't seem to have any >>> effect. For example, I have >>> >>> CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) >>> CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) >>> >>> This compiles and runs, but ksp_view tells me >>> >>> PC Object:(sys_) 3 MPI processes >>> type: fieldsplit >>> FieldSplit with Schur preconditioner, factorization LOWER >>> Preconditioner for the Schur complement formed from A11 >>> >>> So changing the factorization from the default FULL to LOWER did >>> work, but changing the preconditioner from A11 to USER didn't. >>> >>> I've also tried to run directly from the command line using >>> >>> -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view >>> >>> This works in the sense that I don't get the "WARNING! There are >>> options you set that were not used!" message, but still ksp_view >>> reports A11 instead of user provided matrix. >>> >> >> Can you send a small test code, since I use this everyday here and it >> works. >> >> Thanks, >> >> Matt >> >> >>> Chris >>> >>> >>> dr. ir. 
Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> E mailto:C.Klaij at marin.nl >>> T +31 317 49 33 44 >>> >>> >>> MARIN >>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From alpkalpalp at gmail.com Wed Sep 3 08:51:09 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Wed, 3 Sep 2014 16:51:09 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: Message-ID: Dear Matt, While searcing inside of the petsc bundle, I have seen PCBDDCMatFETIDPGetRHS PCBDDCMatFETIDPGetSolution PCBDDCCreateFETIDPOperators also there is an example in petsc-3.5.1\src\ksp\ksp\examples\tutorials\ex59.c however I could not see it on webbrowser interface. Is this an pre-release code? On Wed, Sep 3, 2014 at 2:09 PM, Matthew Knepley wrote: > On Wed, Sep 3, 2014 at 3:10 AM, Alp Kalpalp wrote: > >> Hi, >> >> I need to investigate the performance of FETI-DP on a heteregenous >> problem with a proposed preconditioner/scaling and different parameters. I >> do not want to reinvent the known. I need to investigate the outcomes of my >> research on top of FETI-DP. >> >> I have seen several papers in literature discuss about the FETI-DP >> implementations in PetSc. However, seems none of them is provided in >> public. I have come across several functions in PetSc with *FETIDP* names. >> But, since this algorithm is quite different then the ksp algorithms, I >> could not figure out how to use them. >> >> I have two concrete questions; >> >> 1- Is it possible to complete a FETI-DP solution with the provided >> functions in current PetSc release? >> > > There is no FETI-DP in PETSc. However, I recommend mailing Oliver > Rheinbach to see about collaboration (http://www.uni-koeln.de/~orheinba/) > > Thanks, > > Matt > > >> 2- If so, what is the calling sequence for these functions? >> >> Thanks in advance, >> >> Alp >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Sep 3 09:18:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 09:18:09 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: Message-ID: On Wed, Sep 3, 2014 at 8:51 AM, Alp Kalpalp wrote: > Dear Matt, > > While searcing inside of the petsc bundle, I have seen > > PCBDDCMatFETIDPGetRHS > PCBDDCMatFETIDPGetSolution > PCBDDCCreateFETIDPOperators > > also there is an example in > petsc-3.5.1\src\ksp\ksp\examples\tutorials\ex59.c > > however I could not see it on webbrowser interface. Is this an pre-release > code? > That is my understanding, but you should really talk to Stefano Zampini directly. Matt > > On Wed, Sep 3, 2014 at 2:09 PM, Matthew Knepley wrote: > >> On Wed, Sep 3, 2014 at 3:10 AM, Alp Kalpalp wrote: >> >>> Hi, >>> >>> I need to investigate the performance of FETI-DP on a heteregenous >>> problem with a proposed preconditioner/scaling and different parameters. I >>> do not want to reinvent the known. I need to investigate the outcomes of my >>> research on top of FETI-DP. >>> >>> I have seen several papers in literature discuss about the FETI-DP >>> implementations in PetSc. However, seems none of them is provided in >>> public. I have come across several functions in PetSc with *FETIDP* names. >>> But, since this algorithm is quite different then the ksp algorithms, I >>> could not figure out how to use them. >>> >>> I have two concrete questions; >>> >>> 1- Is it possible to complete a FETI-DP solution with the provided >>> functions in current PetSc release? >>> >> >> There is no FETI-DP in PETSc. However, I recommend mailing Oliver >> Rheinbach to see about collaboration (http://www.uni-koeln.de/~orheinba/) >> >> Thanks, >> >> Matt >> >> >>> 2- If so, what is the calling sequence for these functions? >>> >>> Thanks in advance, >>> >>> Alp >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Sep 3 09:19:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 03 Sep 2014 08:19:12 -0600 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: Message-ID: <871trs7pnj.fsf@jedbrown.org> Matthew Knepley writes: >> 1- Is it possible to complete a FETI-DP solution with the provided >> functions in current PetSc release? >> > > There is no FETI-DP in PETSc. Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You can enable it by configuring --with-pcbddc. This will be turned on by default soon. It is fairly new, so you should use the branch 'master' instead of the release. It has an option to do FETI-DP instead of BDDC. See src/ksp/ksp/examples/tutorials/ex59.c. For either of these methods, you have to assemble a MATIS. If you use MatSetValuesLocal, most of your assembly code can stay the same. Hopefully we can get better examples before the next release. Stefano (the author of PCBDDC, Cc'd) tests mostly with external packages, but we really need more complete tests within PETSc. -------------- next part -------------- A non-text attachment was scrubbed... 
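In code, the MATIS route Jed describes looks roughly like the sketch below. This is only an outline put together from the man pages, not an excerpt from ex59.c: the mesh-dependent pieces (n_local, N_global, the local-to-global index array, the element loop) are placeholders, error checking is omitted, and the FETI-DP-specific calls (PCBDDCCreateFETIDPOperators and friends) are left out, so ex59.c remains the authoritative reference.

  #include <petscksp.h>

  /* Assemble an unassembled (subdomain-wise) MATIS matrix and solve with PCBDDC.
     Assumes a PETSc built with --with-pcbddc; n_local, N_global and l2g_idx come
     from the application's own domain decomposition. */
  static PetscErrorCode solve_with_bddc(MPI_Comm comm, PetscInt n_local, PetscInt N_global,
                                        const PetscInt *l2g_idx, Vec b, Vec x)
  {
    Mat                    A;
    KSP                    ksp;
    PC                     pc;
    ISLocalToGlobalMapping l2g;

    ISLocalToGlobalMappingCreate(comm, 1, n_local, l2g_idx, PETSC_COPY_VALUES, &l2g);
    MatCreate(comm, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N_global, N_global);
    MatSetType(A, MATIS);                     /* each rank keeps its own local matrix */
    MatSetLocalToGlobalMapping(A, l2g, l2g);  /* subdomain -> global numbering */
    /* ... element loop here: MatSetValuesLocal(A, ..., ADD_VALUES) in local numbering,
       i.e. an existing MatSetValuesLocal-based assembly can stay as it is ... */
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    KSPCreate(comm, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCBDDC);                    /* BDDC; see ex59.c for the FETI-DP variant */
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

    KSPDestroy(&ksp);
    MatDestroy(&A);
    ISLocalToGlobalMappingDestroy(&l2g);
    return 0;
  }

The appeal of this route is that, as noted above, an assembly that already goes through MatSetValuesLocal mostly keeps working; only the matrix type and the local-to-global mapping change.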
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From C.Klaij at marin.nl Wed Sep 3 09:23:48 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 3 Sep 2014 14:23:48 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> , Message-ID: <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> Matt, Thanks, after applying the fix to my petsc-3.5.1 install, the small Fortran program works as expected. Now, I would like to change the fortran strategy to the option "3) Using Fortran modules". So, in the small fortran program I replace these seven lines #include #include #include #include #include #include #include by the two following lines use petscksp #include This still compiles but I get the two old problem back... Chris ________________________________ From: Matthew Knepley Sent: Wednesday, September 03, 2014 2:12 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan > wrote: Matt, Thanks for the fix. If I understand correctly, in an existing install of petsc-3.5.1, I would only need to replace the file "finclude/petscpc.h" by the new file for the fix to work? (instead of downloading dev, configuring, installing on various machines). Yes Matt Chris MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Tuesday, September 02, 2014 5:42 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: Matt, Attached is a small Fortran code that replicates the second problem. This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. 
Chris ________________________________ From: Matthew Knepley > Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: From knepley at gmail.com Wed Sep 3 09:38:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 09:38:26 -0500 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> Message-ID: On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan wrote: > Matt, > > Thanks, after applying the fix to my petsc-3.5.1 install, the > small Fortran program works as expected. > > Now, I would like to change the fortran strategy to the > option "3) Using Fortran modules". 
So, in the small fortran > program I replace these seven lines > > #include > #include > #include > #include > #include > #include > #include > > by the two following lines > > use petscksp > #include > > This still compiles but I get the two old problem back... > I have no idea why. Can you look at the values of those enumerations? Matt > Chris > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, September 03, 2014 2:12 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran > > On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan > wrote: > >> Matt, >> >> Thanks for the fix. If I understand correctly, in an existing >> install of petsc-3.5.1, I would only need to replace the >> file "finclude/petscpc.h" by the new file for the fix to >> work? (instead of downloading dev, configuring, installing on >> various machines). >> > > Yes > > Matt > > >> Chris >> >> >> >> MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return it to >> the sender immediately and delete your copy from your system. >> >> >> >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Tuesday, September 02, 2014 5:42 PM >> *To:* Klaij, Christiaan >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >> >> On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan >> wrote: >> >>> Matt, >>> >>> Attached is a small Fortran code that replicates the second problem. >>> >> >> This was a Fortran define problem. I fixed it on next >> >> >> https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran >> >> and it will be in maint and master tomorrow. >> >> Thanks, >> >> Matt >> >> >>> Chris >>> >>> dr. ir. Christiaan Klaij >>> >>> CFD Researcher >>> Research & Development >>> >>> >>> >>> *MARIN* >>> >>> >>> 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 >>> AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I >>> www.marin.nl >>> >>> >>> >>> MARIN news: MARIN at SMM, Hamburg, September 9-12 >>> >>> >>> This e-mail may be confidential, privileged and/or protected by >>> copyright. If you are not the intended recipient, you should return it to >>> the sender immediately and delete your copy from your system. >>> >>> >>> >>> ------------------------------ >>> *From:* Klaij, Christiaan >>> *Sent:* Friday, August 29, 2014 4:42 PM >>> *To:* Matthew Knepley >>> *Cc:* petsc-users at mcs.anl.gov >>> *Subject:* RE: [petsc-users] PCFieldSplitSetSchurPre in fortran >>> >>> Matt, >>> >>> The small test code (ex70) is in C and it works fine, the problem >>> happens in a big Fortran code. I will try to replicate the >>> problem in a small Fortran code, but that will take some time. >>> >>> Chris >>> >>> ------------------------------ >>> *From:* Matthew Knepley >>> *Sent:* Friday, August 29, 2014 4:14 PM >>> *To:* Klaij, Christiaan >>> *Cc:* petsc-users at mcs.anl.gov >>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>> >>> On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan >>> wrote: >>> >>>> I'm trying PCFieldSplitSetSchurPre with >>>> PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. 
>>>> >>>> The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to >>>> be missing in fortran, I get the compile error: >>>> >>>> This name does not have a type, and must have an explicit type. >>>> [PC_FIELDSPLIT_SCHUR_PRE_SELFP] >>>> >>>> while compilation works fine with _A11, _SELF and _USER. >>>> >>> >>> Mark Adams has just fixed this. >>> >>> >>>> The second problem is that the call doesn't seem to have any >>>> effect. For example, I have >>>> >>>> CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) >>>> CALL >>>> PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) >>>> >>>> This compiles and runs, but ksp_view tells me >>>> >>>> PC Object:(sys_) 3 MPI processes >>>> type: fieldsplit >>>> FieldSplit with Schur preconditioner, factorization LOWER >>>> Preconditioner for the Schur complement formed from A11 >>>> >>>> So changing the factorization from the default FULL to LOWER did >>>> work, but changing the preconditioner from A11 to USER didn't. >>>> >>>> I've also tried to run directly from the command line using >>>> >>>> -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view >>>> >>>> This works in the sense that I don't get the "WARNING! There are >>>> options you set that were not used!" message, but still ksp_view >>>> reports A11 instead of user provided matrix. >>>> >>> >>> Can you send a small test code, since I use this everyday here and it >>> works. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Chris >>>> >>>> >>>> dr. ir. Christiaan Klaij >>>> CFD Researcher >>>> Research & Development >>>> E mailto:C.Klaij at marin.nl >>>> T +31 317 49 33 44 >>>> >>>> >>>> MARIN >>>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From C.Klaij at marin.nl Wed Sep 3 09:43:02 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 3 Sep 2014 14:43:02 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local>, Message-ID: <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local> I'm sorry, how do I do that? 
Chris MARIN news: Applied Hydrodynamics of Floating Offshore Structures course, Oct 8 - 10, Houston This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley Sent: Wednesday, September 03, 2014 4:38 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan > wrote: Matt, Thanks, after applying the fix to my petsc-3.5.1 install, the small Fortran program works as expected. Now, I would like to change the fortran strategy to the option "3) Using Fortran modules". So, in the small fortran program I replace these seven lines #include #include #include #include #include #include #include by the two following lines use petscksp #include This still compiles but I get the two old problem back... I have no idea why. Can you look at the values of those enumerations? Matt Chris ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 2:12 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan > wrote: Matt, Thanks for the fix. If I understand correctly, in an existing install of petsc-3.5.1, I would only need to replace the file "finclude/petscpc.h" by the new file for the fix to work? (instead of downloading dev, configuring, installing on various machines). Yes Matt Chris MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Tuesday, September 02, 2014 5:42 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: Matt, Attached is a small Fortran code that replicates the second problem. This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. 
Chris ________________________________ From: Matthew Knepley > Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: From knepley at gmail.com Wed Sep 3 09:44:40 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 09:44:40 -0500 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local> References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local> Message-ID: On Wed, Sep 3, 2014 at 9:43 AM, Klaij, Christiaan wrote: > I'm sorry, how do I do that? 
> print it Matt > Chris > > > > MARIN news: Applied Hydrodynamics of Floating Offshore Structures course, > Oct 8 - 10, Houston > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, September 03, 2014 4:38 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran > > On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan > wrote: > >> Matt, >> >> Thanks, after applying the fix to my petsc-3.5.1 install, the >> small Fortran program works as expected. >> >> Now, I would like to change the fortran strategy to the >> option "3) Using Fortran modules". So, in the small fortran >> program I replace these seven lines >> >> #include >> #include >> #include >> #include >> #include >> #include >> #include >> >> by the two following lines >> >> use petscksp >> #include >> >> This still compiles but I get the two old problem back... >> > > I have no idea why. Can you look at the values of those enumerations? > > Matt > > >> Chris >> >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Wednesday, September 03, 2014 2:12 PM >> *To:* Klaij, Christiaan >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >> >> On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan >> wrote: >> >>> Matt, >>> >>> Thanks for the fix. If I understand correctly, in an existing >>> install of petsc-3.5.1, I would only need to replace the >>> file "finclude/petscpc.h" by the new file for the fix to >>> work? (instead of downloading dev, configuring, installing on >>> various machines). >>> >> >> Yes >> >> Matt >> >> >>> Chris >>> >>> >>> >>> MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg >>> >>> >>> This e-mail may be confidential, privileged and/or protected by >>> copyright. If you are not the intended recipient, you should return it to >>> the sender immediately and delete your copy from your system. >>> >>> >>> >>> ------------------------------ >>> *From:* Matthew Knepley >>> *Sent:* Tuesday, September 02, 2014 5:42 PM >>> *To:* Klaij, Christiaan >>> *Cc:* petsc-users at mcs.anl.gov >>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>> >>> On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan >>> wrote: >>> >>>> Matt, >>>> >>>> Attached is a small Fortran code that replicates the second problem. >>>> >>> >>> This was a Fortran define problem. I fixed it on next >>> >>> >>> https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran >>> >>> and it will be in maint and master tomorrow. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Chris >>>> >>>> dr. ir. Christiaan Klaij >>>> >>>> CFD Researcher >>>> Research & Development >>>> >>>> >>>> >>>> *MARIN* >>>> >>>> >>>> 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 >>>> AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I >>>> www.marin.nl >>>> >>>> >>>> >>>> MARIN news: MARIN at SMM, Hamburg, September 9-12 >>>> >>>> >>>> This e-mail may be confidential, privileged and/or protected by >>>> copyright. If you are not the intended recipient, you should return it to >>>> the sender immediately and delete your copy from your system. 
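(For anyone reading along: "print it" here just means adding a write of the parameter to the small test program, e.g.

  print *, PC_FIELDSPLIT_SCHUR_PRE_USER

and comparing the printed value with the parameter statement in finclude/petscpc.h; this is exactly what is done a couple of messages further down.)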
>>>> >>>> >>>> >>>> ------------------------------ >>>> *From:* Klaij, Christiaan >>>> *Sent:* Friday, August 29, 2014 4:42 PM >>>> *To:* Matthew Knepley >>>> *Cc:* petsc-users at mcs.anl.gov >>>> *Subject:* RE: [petsc-users] PCFieldSplitSetSchurPre in fortran >>>> >>>> Matt, >>>> >>>> The small test code (ex70) is in C and it works fine, the problem >>>> happens in a big Fortran code. I will try to replicate the >>>> problem in a small Fortran code, but that will take some time. >>>> >>>> Chris >>>> >>>> ------------------------------ >>>> *From:* Matthew Knepley >>>> *Sent:* Friday, August 29, 2014 4:14 PM >>>> *To:* Klaij, Christiaan >>>> *Cc:* petsc-users at mcs.anl.gov >>>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>>> >>>> On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan >>>> wrote: >>>> >>>>> I'm trying PCFieldSplitSetSchurPre with >>>>> PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. >>>>> >>>>> The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to >>>>> be missing in fortran, I get the compile error: >>>>> >>>>> This name does not have a type, and must have an explicit type. >>>>> [PC_FIELDSPLIT_SCHUR_PRE_SELFP] >>>>> >>>>> while compilation works fine with _A11, _SELF and _USER. >>>>> >>>> >>>> Mark Adams has just fixed this. >>>> >>>> >>>>> The second problem is that the call doesn't seem to have any >>>>> effect. For example, I have >>>>> >>>>> CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) >>>>> CALL >>>>> PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) >>>>> >>>>> This compiles and runs, but ksp_view tells me >>>>> >>>>> PC Object:(sys_) 3 MPI processes >>>>> type: fieldsplit >>>>> FieldSplit with Schur preconditioner, factorization LOWER >>>>> Preconditioner for the Schur complement formed from A11 >>>>> >>>>> So changing the factorization from the default FULL to LOWER did >>>>> work, but changing the preconditioner from A11 to USER didn't. >>>>> >>>>> I've also tried to run directly from the command line using >>>>> >>>>> -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view >>>>> >>>>> This works in the sense that I don't get the "WARNING! There are >>>>> options you set that were not used!" message, but still ksp_view >>>>> reports A11 instead of user provided matrix. >>>>> >>>> >>>> Can you send a small test code, since I use this everyday here and it >>>> works. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Chris >>>>> >>>>> >>>>> dr. ir. Christiaan Klaij >>>>> CFD Researcher >>>>> Research & Development >>>>> E mailto:C.Klaij at marin.nl >>>>> T +31 317 49 33 44 >>>>> >>>>> >>>>> MARIN >>>>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>>>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From C.Klaij at marin.nl Wed Sep 3 09:46:27 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 3 Sep 2014 14:46:27 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local>, Message-ID: print *, PC_FIELDSPLIT_SCHUR_PRE_USER gives: 2. (changing to *_SELFP gives: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP]) MARIN news: MARIN Report 112: Hydro-structural This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley Sent: Wednesday, September 03, 2014 4:44 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:43 AM, Klaij, Christiaan > wrote: I'm sorry, how do I do that? print it Matt Chris MARIN news: Applied Hydrodynamics of Floating Offshore Structures course, Oct 8 - 10, Houston This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 4:38 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan > wrote: Matt, Thanks, after applying the fix to my petsc-3.5.1 install, the small Fortran program works as expected. Now, I would like to change the fortran strategy to the option "3) Using Fortran modules". So, in the small fortran program I replace these seven lines #include #include #include #include #include #include #include by the two following lines use petscksp #include This still compiles but I get the two old problem back... I have no idea why. Can you look at the values of those enumerations? Matt Chris ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 2:12 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan > wrote: Matt, Thanks for the fix. If I understand correctly, in an existing install of petsc-3.5.1, I would only need to replace the file "finclude/petscpc.h" by the new file for the fix to work? 
(instead of downloading dev, configuring, installing on various machines). Yes Matt Chris MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Tuesday, September 02, 2014 5:42 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: Matt, Attached is a small Fortran code that replicates the second problem. This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. Chris ________________________________ From: Matthew Knepley > Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. 
Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: From knepley at gmail.com Wed Sep 3 10:02:12 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 10:02:12 -0500 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local> Message-ID: On Wed, Sep 3, 2014 at 9:46 AM, Klaij, Christiaan wrote: > print *, PC_FIELDSPLIT_SCHUR_PRE_USER > > gives: 2. > > (changing to *_SELFP gives: This name does not have a type, and must have > an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP]) > I have next:/PETSc3/petsc/petsc-pylith$ find include/finclude/ -type f | xargs grep USER find include/finclude/ -type f | xargs grep USER include/finclude//petscerrordef.h:#define PETSC_ERR_USER 83 include/finclude//petscpc.h: PetscEnum PC_FIELDSPLIT_SCHUR_PRE_USER include/finclude//petscpc.h: parameter (PC_FIELDSPLIT_SCHUR_PRE_USER=3) include/finclude//petsctao.h: PetscEnum TAO_CONVERGED_USER include/finclude//petsctao.h: PetscEnum TAO_DIVERGED_USER include/finclude//petsctao.h: parameter ( TAO_CONVERGED_USER = 8) include/finclude//petsctao.h: parameter ( TAO_DIVERGED_USER = -8) Where could it possibly be getting this value from? I think from some compiled module which you need to rebuild. Matt > > > MARIN news: MARIN Report 112: Hydro-structural > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Wednesday, September 03, 2014 4:44 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran > > On Wed, Sep 3, 2014 at 9:43 AM, Klaij, Christiaan > wrote: > >> I'm sorry, how do I do that? 
>> > > print it > > Matt > > >> Chris >> >> >> >> MARIN news: Applied Hydrodynamics of Floating Offshore Structures >> course, Oct 8 - 10, Houston >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return it to >> the sender immediately and delete your copy from your system. >> >> >> >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Wednesday, September 03, 2014 4:38 PM >> *To:* Klaij, Christiaan >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >> >> On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan >> wrote: >> >>> Matt, >>> >>> Thanks, after applying the fix to my petsc-3.5.1 install, the >>> small Fortran program works as expected. >>> >>> Now, I would like to change the fortran strategy to the >>> option "3) Using Fortran modules". So, in the small fortran >>> program I replace these seven lines >>> >>> #include >>> #include >>> #include >>> #include >>> #include >>> #include >>> #include >>> >>> by the two following lines >>> >>> use petscksp >>> #include >>> >>> This still compiles but I get the two old problem back... >>> >> >> I have no idea why. Can you look at the values of those enumerations? >> >> Matt >> >> >>> Chris >>> >>> ------------------------------ >>> *From:* Matthew Knepley >>> *Sent:* Wednesday, September 03, 2014 2:12 PM >>> *To:* Klaij, Christiaan >>> *Cc:* petsc-users at mcs.anl.gov >>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>> >>> On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan >>> wrote: >>> >>>> Matt, >>>> >>>> Thanks for the fix. If I understand correctly, in an existing >>>> install of petsc-3.5.1, I would only need to replace the >>>> file "finclude/petscpc.h" by the new file for the fix to >>>> work? (instead of downloading dev, configuring, installing on >>>> various machines). >>>> >>> >>> Yes >>> >>> Matt >>> >>> >>>> Chris >>>> >>>> >>>> >>>> MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg >>>> >>>> >>>> This e-mail may be confidential, privileged and/or protected by >>>> copyright. If you are not the intended recipient, you should return it to >>>> the sender immediately and delete your copy from your system. >>>> >>>> >>>> >>>> ------------------------------ >>>> *From:* Matthew Knepley >>>> *Sent:* Tuesday, September 02, 2014 5:42 PM >>>> *To:* Klaij, Christiaan >>>> *Cc:* petsc-users at mcs.anl.gov >>>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>>> >>>> On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan >>>> wrote: >>>> >>>>> Matt, >>>>> >>>>> Attached is a small Fortran code that replicates the second problem. >>>>> >>>> >>>> This was a Fortran define problem. I fixed it on next >>>> >>>> >>>> https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran >>>> >>>> and it will be in maint and master tomorrow. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Chris >>>>> >>>>> dr. ir. Christiaan Klaij >>>>> >>>>> CFD Researcher >>>>> Research & Development >>>>> >>>>> >>>>> >>>>> *MARIN* >>>>> >>>>> >>>>> 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 >>>>> AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I >>>>> www.marin.nl >>>>> >>>>> >>>>> >>>>> MARIN news: MARIN at SMM, Hamburg, September 9-12 >>>>> >>>>> >>>>> This e-mail may be confidential, privileged and/or protected by >>>>> copyright. 
If you are not the intended recipient, you should return it to >>>>> the sender immediately and delete your copy from your system. >>>>> >>>>> >>>>> >>>>> ------------------------------ >>>>> *From:* Klaij, Christiaan >>>>> *Sent:* Friday, August 29, 2014 4:42 PM >>>>> *To:* Matthew Knepley >>>>> *Cc:* petsc-users at mcs.anl.gov >>>>> *Subject:* RE: [petsc-users] PCFieldSplitSetSchurPre in fortran >>>>> >>>>> Matt, >>>>> >>>>> The small test code (ex70) is in C and it works fine, the problem >>>>> happens in a big Fortran code. I will try to replicate the >>>>> problem in a small Fortran code, but that will take some time. >>>>> >>>>> Chris >>>>> >>>>> ------------------------------ >>>>> *From:* Matthew Knepley >>>>> *Sent:* Friday, August 29, 2014 4:14 PM >>>>> *To:* Klaij, Christiaan >>>>> *Cc:* petsc-users at mcs.anl.gov >>>>> *Subject:* Re: [petsc-users] PCFieldSplitSetSchurPre in fortran >>>>> >>>>> On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan >>>> > wrote: >>>>> >>>>>> I'm trying PCFieldSplitSetSchurPre with >>>>>> PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. >>>>>> >>>>>> The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to >>>>>> be missing in fortran, I get the compile error: >>>>>> >>>>>> This name does not have a type, and must have an explicit type. >>>>>> [PC_FIELDSPLIT_SCHUR_PRE_SELFP] >>>>>> >>>>>> while compilation works fine with _A11, _SELF and _USER. >>>>>> >>>>> >>>>> Mark Adams has just fixed this. >>>>> >>>>> >>>>>> The second problem is that the call doesn't seem to have any >>>>>> effect. For example, I have >>>>>> >>>>>> CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) >>>>>> CALL >>>>>> PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) >>>>>> >>>>>> This compiles and runs, but ksp_view tells me >>>>>> >>>>>> PC Object:(sys_) 3 MPI processes >>>>>> type: fieldsplit >>>>>> FieldSplit with Schur preconditioner, factorization LOWER >>>>>> Preconditioner for the Schur complement formed from A11 >>>>>> >>>>>> So changing the factorization from the default FULL to LOWER did >>>>>> work, but changing the preconditioner from A11 to USER didn't. >>>>>> >>>>>> I've also tried to run directly from the command line using >>>>>> >>>>>> -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view >>>>>> >>>>>> This works in the sense that I don't get the "WARNING! There are >>>>>> options you set that were not used!" message, but still ksp_view >>>>>> reports A11 instead of user provided matrix. >>>>>> >>>>> >>>>> Can you send a small test code, since I use this everyday here and >>>>> it works. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Chris >>>>>> >>>>>> >>>>>> dr. ir. Christiaan Klaij >>>>>> CFD Researcher >>>>>> Research & Development >>>>>> E mailto:C.Klaij at marin.nl >>>>>> T +31 317 49 33 44 >>>>>> >>>>>> >>>>>> MARIN >>>>>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>>>>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: From stefano.zampini at gmail.com Wed Sep 3 10:18:31 2014 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Wed, 3 Sep 2014 18:18:31 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <871trs7pnj.fsf@jedbrown.org> References: <871trs7pnj.fsf@jedbrown.org> Message-ID: <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the dual of the other) and it does not have its own classes so far. That said, you can experiment with FETI-DP only after having setup a BDDC preconditioner with the options and customization you prefer. Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for manual pages. For an 'how to' with FETIDP, please see src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to obtain a right-hand side for the FETIDP system and a physical solution from the solution of the FETIDP system. I would recommend you to use the development version of the library and either use the ?next? branch or the ?master' branch after having merged in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also contains the new deluxe scaling operator for BDDC which is not available to use with FETI-DP. If you have any other questions which can be useful for other PETSc users, please use the mailing list; otherwise you can contact me personally. Stefano On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > Matthew Knepley writes: >>> 1- Is it possible to complete a FETI-DP solution with the provided >>> functions in current PetSc release? >>> >> >> There is no FETI-DP in PETSc. > > Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You > can enable it by configuring --with-pcbddc. This will be turned on by > default soon. It is fairly new, so you should use the branch 'master' > instead of the release. It has an option to do FETI-DP instead of BDDC. > See src/ksp/ksp/examples/tutorials/ex59.c. > > For either of these methods, you have to assemble a MATIS. If you use > MatSetValuesLocal, most of your assembly code can stay the same. 
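To make the MATIS remark concrete, a rough Fortran sketch of the assembly pattern follows. Only MATIS, MatSetValuesLocal, PCBDDC and ex59.c come from the messages above; the declarations, sizes and element loop (nown, nen, idx, elmat, map) are illustrative placeholders, and the exact creation signatures should be checked against the manual pages for the PETSc version in use.

  ! Sketch: keep the operator in unassembled (subdomain) form as a MATIS
  ! so that PCBDDC, and the FETI-DP system derived from it, can be used.
  Mat                      A
  KSP                      ksp
  PC                       pc
  ISLocalToGlobalMapping   map     ! this rank's local-to-global numbering
  PetscErrorCode           ierr

  call MatCreate(PETSC_COMM_WORLD,A,ierr)
  call MatSetSizes(A,nown,nown,PETSC_DECIDE,PETSC_DECIDE,ierr)
  call MatSetType(A,MATIS,ierr)
  call MatSetLocalToGlobalMapping(A,map,map,ierr)
  ! element loop: insert each element matrix with local indices, just as
  ! one would for an ordinary AIJ assembly with MatSetValuesLocal
  call MatSetValuesLocal(A,nen,idx,nen,idx,elmat,ADD_VALUES,ierr)
  call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
  call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
  ! hand it to the solver and select BDDC
  call KSPSetOperators(ksp,A,A,ierr)
  call KSPGetPC(ksp,pc,ierr)
  call PCSetType(pc,PCBDDC,ierr)
  ! for the FETI-DP operator F and its Dirichlet preconditioner, follow
  ! ComputeKSPFETIDP in src/ksp/ksp/examples/tutorials/ex59.c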
> > Hopefully we can get better examples before the next release. Stefano > (the author of PCBDDC, Cc'd) tests mostly with external packages, but we > really need more complete tests within PETSc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwelland at anl.gov Wed Sep 3 13:32:44 2014 From: mwelland at anl.gov (Welland, Michael J.) Date: Wed, 3 Sep 2014 18:32:44 +0000 Subject: [petsc-users] Loosing a flux when using the asm preconditioned for >2 cores Message-ID: Hi all, I'm simulating a problem with small fluxes, using the asm preconditioner and lu as the sub preconditioner. The simulation runs fine using 2 cores, but when I use more the fluxes disappear and the desired effect goes with them. Does anyone have an idea of a suitable tolerance or parameter I should adjust? I am using the snes solver via the FEniCS package. Thanks, Mike I attach an snes terminal output for reference: SNES Object: 16 MPI processes type: newtonls maximum iterations=30, maximum function evaluations=2000 tolerances: relative=0.99, absolute=1e-05, solution=1e-10 total number of linear solver iterations=59 total number of function evaluations=2 SNESLineSearch Object: 16 MPI processes type: basic maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=1 KSP Object: 16 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 16 MPI processes type: asm Additive Schwarz: total subdomain blocks = 16, amount of overlap = 5 Additive Schwarz: restriction/interpolation type - NONE Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 5.25151 Factored matrix follows: Matrix Object: 1 MPI processes type: seqaij rows=4412, cols=4412 package used to perform factorization: petsc total: nonzeros=626736, allocated nonzeros=626736 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 1103 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 1 MPI processes type: seqaij rows=4412, cols=4412 total: nonzeros=119344, allocated nonzeros=119344 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 1103 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: 16 MPI processes type: mpiaij rows=41820, cols=41820, bs=4 total: nonzeros=1161136, allocated nonzeros=1161136 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 638 nodes, limit used is 5 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Sep 3 13:36:07 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Sep 2014 13:36:07 -0500 Subject: [petsc-users] Loosing a flux when using the asm preconditioned for >2 cores In-Reply-To: References: Message-ID: <4F0440DC-7CAD-439D-8DD7-862929C169CA@mcs.anl.gov> Start by running with -snes_converged_reason -snes_monitor -ksp_converged_reason -ksp_monitor_true_residual -snes_linesearch_monitor on two and then three cores. By default, ?not converging? of an iterative solver does not generate an error so the most likely cause is that the iterative scheme is just not converging with more processors. Barry On Sep 3, 2014, at 1:32 PM, Welland, Michael J. wrote: > Hi all, > > I'm simulating a problem with small fluxes, using the asm preconditioner and lu as the sub preconditioner. The simulation runs fine using 2 cores, but when I use more the fluxes disappear and the desired effect goes with them. > > Does anyone have an idea of a suitable tolerance or parameter I should adjust? I am using the snes solver via the FEniCS package. > > Thanks, > Mike > > I attach an snes terminal output for reference: > > SNES Object: 16 MPI processes > type: newtonls > maximum iterations=30, maximum function evaluations=2000 > tolerances: relative=0.99, absolute=1e-05, solution=1e-10 > total number of linear solver iterations=59 > total number of function evaluations=2 > SNESLineSearch Object: 16 MPI processes > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=1 > KSP Object: 16 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 16 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 16, amount of overlap = 5 > Additive Schwarz: restriction/interpolation type - NONE > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 5.25151 > Factored matrix follows: > Matrix Object: 1 MPI processes > type: seqaij > rows=4412, cols=4412 > package used to perform factorization: petsc > total: nonzeros=626736, allocated nonzeros=626736 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 1103 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 1 MPI processes > type: seqaij > rows=4412, cols=4412 > total: nonzeros=119344, allocated nonzeros=119344 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 1103 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: 16 MPI processes > type: mpiaij > rows=41820, cols=41820, bs=4 > total: nonzeros=1161136, allocated nonzeros=1161136 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 638 
nodes, limit used is 5 > From mwelland at anl.gov Wed Sep 3 14:05:14 2014 From: mwelland at anl.gov (Welland, Michael J.) Date: Wed, 3 Sep 2014 19:05:14 +0000 Subject: [petsc-users] Loosing a flux when using the asm preconditioned for >2 cores Message-ID: Thanks Barry, I attach the terminal outputs. From my reading, it seems like everything converged alright, no? Usually I get an error message when the linear or snes solver doesn't converge. By the way, I don't think it should matter but for your reference I am doing an operator split method in FEniCS where I am alternating between solving two system until their solution vectors stop changing between iterations. Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: correct_fluxes.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: lost_fluxes.txt URL: From knepley at gmail.com Wed Sep 3 14:09:32 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Sep 2014 14:09:32 -0500 Subject: [petsc-users] Loosing a flux when using the asm preconditioned for >2 cores In-Reply-To: References: Message-ID: On Wed, Sep 3, 2014 at 2:05 PM, Welland, Michael J. wrote: > Thanks Barry, I attach the terminal outputs. From my reading, it seems > like everything converged alright, no? Usually I get an error message when > the linear or snes solver doesn't converge. > > By the way, I don't think it should matter but for your reference I am > doing an operator split method in FEniCS where I am alternating between > solving two system until their solution vectors stop changing between > iterations. > That is not convergence, it is stagnation. Using that measure, you can "converge" to things that are not solutions of the equation. Matt > Thanks > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 3 15:02:26 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Sep 2014 15:02:26 -0500 Subject: [petsc-users] Loosing a flux when using the asm preconditioned for >2 cores In-Reply-To: References: Message-ID: <623F03E2-21D8-4144-88A2-A5F5A008918A@mcs.anl.gov> Force really tight convergence on the nonlinear solver -snes_rtol 1.e-12 Note also that initial function norm 0 SNES Function norm 2.418625311837e+01 0 SNES Function norm 3.156023513435e+04 is very different in the two cases. This means either you are solving a different nonlinear problem on 2 and 3 processes or the initial conditions are very different (or both). Barry On Sep 3, 2014, at 2:05 PM, Welland, Michael J. wrote: > Thanks Barry, I attach the terminal outputs. From my reading, it seems like everything converged alright, no? Usually I get an error message when the linear or snes solver doesn't converge. > > By the way, I don't think it should matter but for your reference I am doing an operator split method in FEniCS where I am alternating between solving two system until their solution vectors stop changing between iterations. 
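Collected in one place, the suggested diagnostics amount to rerunning the same case on two and then on more cores with PETSc's monitoring options plus the tight nonlinear tolerance, e.g. (the executable name is a placeholder; with FEniCS the options still have to reach PETSc's options database, however they are passed through):

  mpiexec -n 2 ./app -snes_converged_reason -snes_monitor \
      -ksp_converged_reason -ksp_monitor_true_residual \
      -snes_linesearch_monitor -snes_rtol 1.e-12

and then comparing the "0 SNES Function norm" lines between the runs: if the initial residuals already differ, the runs are starting from different problems or different initial conditions rather than failing in the preconditioner.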
> > Thanks > > From karpeev at mcs.anl.gov Wed Sep 3 16:39:15 2014 From: karpeev at mcs.anl.gov (Dmitry Karpeyev) Date: Wed, 3 Sep 2014 16:39:15 -0500 Subject: [petsc-users] Postdoctoral position at ANL/MCS (CHiMaD) Message-ID: *Postdoctoral position at the Center for Hierarchical Materials Design* We have an opening for a postdoctoral position in the Center for Hierarchical Materials Design (CHiMaD), a joint center between Northwestern University, the University of Chicago, and Argonne National Laboratory, sponsored by NIST, http://chimad.northwestern.edu. We are looking for a person who will help us develop computational methods and codes for mesoscale materials modeling. The goal of the work is to develop efficient, scalable community codes for a variety of phase-field models that allow coupling with other models. We use real-space methods on unstructured irregular meshes to solve coupled partial differential equations, such as Cahn-Hilliard and Allen-Cahn equations, coupled to, for example, Coulombic fields or elastic strain fields. Candidates should have suitable training in physics, materials science, or applied mathematics, with experience in implementing numerical finite-element based methods for partial differential equations. Experience with C/C++ programming, MPI, and libraries such as PETSc, libMesh, and MOOSE is a plus. The candidates would work with diverse community of material scientists, applied mathematicians and numerical analysts in CHiMaD, including Olle Heinonen and Barry Smith (Argonne), Dmitry Karpeyev (U of C) and Peter Voorhees (Northwestern). For more information contact Dmitry Karpeyev (karpeev at uchicago.edu) -------------- next part -------------- An HTML attachment was scrubbed... URL: From elyaskazeeem at syremail.com Wed Sep 3 17:25:50 2014 From: elyaskazeeem at syremail.com (Elyas Kazeem) Date: Wed, 3 Sep 2014 22:25:50 +0000 Subject: [petsc-users] Proposal Message-ID: <2428355464664721524421@CLOUD-SERVER-66> Good day. I have attached a business proposal that I believe will be of mutual benefit to both of us to this email. Please go through the attached proposal and let me know if you are interested in working with us on this project. After reading the proposal, you can email me back on: elyaskazeem at syremail.com so that we can go over the details together. If for some reasons you are unable to view the attachment, kindly inform me so that I can so that I can resend the proposal in a different format or plain text. Regards. Elyas. -------------- next part -------------- A non-text attachment was scrubbed... Name: Proposal.pdf Type: application/pdf Size: 478566 bytes Desc: not available URL: From C.Klaij at marin.nl Thu Sep 4 01:29:38 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 4 Sep 2014 06:29:38 +0000 Subject: [petsc-users] PCFieldSplitSetSchurPre in fortran In-Reply-To: References: <0287e6aceb5a49d1b99709b358b69d6d@MAR190n2.marin.local> <36ef997c4d7a436da57f5c7311aa74dd@MAR190n2.marin.local> <5b3e7ab3188f4b39937f35e5a6e0c7e7@MAR190n2.marin.local> , Message-ID: <58657a9350ff4b32a7754ef90e5f7560@MAR190n2.marin.local> I did a make allclean, make and install. That solved the two problems. So I'm guessing these values are taken from the petscksp module or its predecessors instead of being taken directly from the finclude/petscpc.h file. Case closed. 
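For anyone hitting the same thing: patching include/finclude/petscpc.h fixes code that reads the headers directly, but a program that does "use petscksp" gets the enumeration values from the compiled module, which keeps the old values until the library is rebuilt. The fix reported above corresponds to roughly the following (a sketch; paths and the optional install step depend on how PETSc was configured):

  cd $PETSC_DIR
  make allclean     # discards the old objects, including the Fortran modules
  make              # rebuilds the library and regenerates the petscksp module
  # plus "make install" if the build was configured with a --prefix

After that, the printed value of PC_FIELDSPLIT_SCHUR_PRE_USER should match the parameter in finclude/petscpc.h.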
Chris ________________________________ From: Matthew Knepley Sent: Wednesday, September 03, 2014 5:02 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:46 AM, Klaij, Christiaan > wrote: print *, PC_FIELDSPLIT_SCHUR_PRE_USER gives: 2. (changing to *_SELFP gives: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP]) I have next:/PETSc3/petsc/petsc-pylith$ find include/finclude/ -type f | xargs grep USER find include/finclude/ -type f | xargs grep USER include/finclude//petscerrordef.h:#define PETSC_ERR_USER 83 include/finclude//petscpc.h: PetscEnum PC_FIELDSPLIT_SCHUR_PRE_USER include/finclude//petscpc.h: parameter (PC_FIELDSPLIT_SCHUR_PRE_USER=3) include/finclude//petsctao.h: PetscEnum TAO_CONVERGED_USER include/finclude//petsctao.h: PetscEnum TAO_DIVERGED_USER include/finclude//petsctao.h: parameter ( TAO_CONVERGED_USER = 8) include/finclude//petsctao.h: parameter ( TAO_DIVERGED_USER = -8) Where could it possibly be getting this value from? I think from some compiled module which you need to rebuild. Matt MARIN news: MARIN Report 112: Hydro-structural This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 4:44 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:43 AM, Klaij, Christiaan > wrote: I'm sorry, how do I do that? print it Matt Chris MARIN news: Applied Hydrodynamics of Floating Offshore Structures course, Oct 8 - 10, Houston This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 4:38 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 9:23 AM, Klaij, Christiaan > wrote: Matt, Thanks, after applying the fix to my petsc-3.5.1 install, the small Fortran program works as expected. Now, I would like to change the fortran strategy to the option "3) Using Fortran modules". So, in the small fortran program I replace these seven lines #include #include #include #include #include #include #include by the two following lines use petscksp #include This still compiles but I get the two old problem back... I have no idea why. Can you look at the values of those enumerations? Matt Chris ________________________________ From: Matthew Knepley > Sent: Wednesday, September 03, 2014 2:12 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Wed, Sep 3, 2014 at 7:00 AM, Klaij, Christiaan > wrote: Matt, Thanks for the fix. If I understand correctly, in an existing install of petsc-3.5.1, I would only need to replace the file "finclude/petscpc.h" by the new file for the fix to work? (instead of downloading dev, configuring, installing on various machines). Yes Matt Chris MARIN news: Bas Buchner speaker at Lowpex conference at SMM Hamburg This e-mail may be confidential, privileged and/or protected by copyright. 
If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Tuesday, September 02, 2014 5:42 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Tue, Sep 2, 2014 at 2:08 AM, Klaij, Christiaan > wrote: Matt, Attached is a small Fortran code that replicates the second problem. This was a Fortran define problem. I fixed it on next https://bitbucket.org/petsc/petsc/branch/knepley/fix-pc-fieldsplit-fortran and it will be in maint and master tomorrow. Thanks, Matt Chris [cid:image76463f.JPG at e74baedb.4a899186][cid:imageae8e3c.JPG at 1f0a6dd3.4e92756a] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Klaij, Christiaan Sent: Friday, August 29, 2014 4:42 PM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] PCFieldSplitSetSchurPre in fortran Matt, The small test code (ex70) is in C and it works fine, the problem happens in a big Fortran code. I will try to replicate the problem in a small Fortran code, but that will take some time. Chris ________________________________ From: Matthew Knepley > Sent: Friday, August 29, 2014 4:14 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PCFieldSplitSetSchurPre in fortran On Fri, Aug 29, 2014 at 8:55 AM, Klaij, Christiaan > wrote: I'm trying PCFieldSplitSetSchurPre with PC_FIELDSPLIT_SCHUR_PRE_SELFP in petsc-3.5.1 using fortran. The first problem is that PC_FIELDSPLIT_SCHUR_PRE_SELFP seems to be missing in fortran, I get the compile error: This name does not have a type, and must have an explicit type. [PC_FIELDSPLIT_SCHUR_PRE_SELFP] while compilation works fine with _A11, _SELF and _USER. Mark Adams has just fixed this. The second problem is that the call doesn't seem to have any effect. For example, I have CALL PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_USER,aa,ierr) CALL PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr) This compiles and runs, but ksp_view tells me PC Object:(sys_) 3 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from A11 So changing the factorization from the default FULL to LOWER did work, but changing the preconditioner from A11 to USER didn't. I've also tried to run directly from the command line using -sys_pc_fieldsplit_schur_precondition user -sys_ksp_view This works in the sense that I don't get the "WARNING! There are options you set that were not used!" message, but still ksp_view reports A11 instead of user provided matrix. Can you send a small test code, since I use this everyday here and it works. Thanks, Matt Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. 
Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image76463f.JPG Type: image/jpeg Size: 1069 bytes Desc: image76463f.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageae8e3c.JPG Type: image/jpeg Size: 1622 bytes Desc: imageae8e3c.JPG URL: From alpkalpalp at gmail.com Thu Sep 4 06:42:12 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Thu, 4 Sep 2014 14:42:12 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: Hi, Although I am experienced on C++ development on windows by using SVN, I need to learn too much things to use this algorithm I think.(petsc, debugging in linux, git, hg etc.) So, sorry for the rookie questions in advance. You suggested me to use development version ----> petsc-dev (I believe I will compile and use it similar to petsc version of it), however it seems it has no branch named as "master", "next" or "stefano_zampini/pcbddc-primalfixes" These branches exist in "petsc" repo. Is it possible to merge petsc-dev and "stefano_zampini/pcbddc-primalfixes" petsc-dev is hg, petsc is git ! or am I get it totaly wrong! I am confused. Could someone explain me the development hierarchy in petsc? what is petsc-dev and why it has no branches? On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini wrote: > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the > dual of the other) and it does not have its own classes so far. > > That said, you can experiment with FETI-DP only after having setup a BDDC > preconditioner with the options and customization you prefer. > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for > manual pages. > > For an 'how to' with FETIDP, please see > src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look > at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > Matrix-free) and the optimal FETIDP dirichlet preconditioner. 
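To make that call sequence concrete, here is a rough C sketch of the solve phase Stefano describes here and continues just below. It assumes the FETI-DP operator F and its preconditioner fetidp_pc have already been built as in ComputeKSPFETIDP of ex59.c, that the original system uses a MATIS operator (see Jed's note quoted further down), and that b, x, fetidp_rhs and fetidp_sol are compatible vectors; the wiring through KSPSetPC and all variable names are illustrative assumptions, not taken from ex59, and error checking is omitted.

  KSP ksp_f;
  KSPCreate(PETSC_COMM_WORLD,&ksp_f);
  KSPSetOperators(ksp_f,F,F);                   /* F is applied matrix-free */
  KSPSetPC(ksp_f,fetidp_pc);                    /* FETI-DP Dirichlet preconditioner */
  KSPSetFromOptions(ksp_f);
  PCBDDCMatFETIDPGetRHS(F,b,fetidp_rhs);        /* physical rhs -> multiplier rhs */
  KSPSolve(ksp_f,fetidp_rhs,fetidp_sol);
  PCBDDCMatFETIDPGetSolution(F,fetidp_sol,x);   /* multipliers -> physical solution */
  KSPDestroy(&ksp_f);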
Once you have > F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to > obtain a right-hand side for the FETIDP system and a physical solution from > the solution of the FETIDP system. > > I would recommend you to use the development version of the library and > either use the ?next? branch or the ?master' branch after having merged > in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also > contains the new deluxe scaling operator for BDDC which is not available > to use with FETI-DP. > > If you have any other questions which can be useful for other PETSc users, > please use the mailing list; otherwise you can contact me personally. > > Stefano > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > Matthew Knepley writes: > > 1- Is it possible to complete a FETI-DP solution with the provided > functions in current PetSc release? > > > There is no FETI-DP in PETSc. > > > Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You > can enable it by configuring --with-pcbddc. This will be turned on by > default soon. It is fairly new, so you should use the branch 'master' > instead of the release. It has an option to do FETI-DP instead of BDDC. > See src/ksp/ksp/examples/tutorials/ex59.c. > > For either of these methods, you have to assemble a MATIS. If you use > MatSetValuesLocal, most of your assembly code can stay the same. > > Hopefully we can get better examples before the next release. Stefano > (the author of PCBDDC, Cc'd) tests mostly with external packages, but we > really need more complete tests within PETSc. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Sep 4 07:06:05 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 4 Sep 2014 12:06:05 +0000 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp Message-ID: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> I'm playing with the selfp option in fieldsplit using snes/examples/tutorials/ex70.c. For example: mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -pc_fieldsplit_schur_precondition selfp \ -fieldsplit_1_inner_ksp_type preonly \ -fieldsplit_1_inner_pc_type jacobi \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 gives the following output 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_1_ solve. 0 KSP Residual norm 2.330138480101e+01 1 KSP Residual norm 1.609000846751e+01 1 KSP Residual norm 1.180287268335e+00 To my suprise I don't see anything for the fieldsplit_0_ solve, why? Furthermore, if I understand correctly the above should be exactly equivalent with mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -user_ksp \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_0_ solve. 0 KSP Residual norm 5.486639587672e-01 1 KSP Residual norm 6.348354253703e-02 Residual norms for fieldsplit_1_ solve. 0 KSP Residual norm 2.321938107977e+01 1 KSP Residual norm 1.605484031258e+01 1 KSP Residual norm 1.183225251166e+00 because -user_ksp replaces the Schur complement by the simple approximation A11 - A10 inv(diag(A00)) A01. 
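As an aside for anyone reproducing this, the option set above can also be selected in code. A minimal, untested C sketch, assuming pc is the PC of the outer KSP and isu/isp are the index sets of the two splits (these names are placeholders; error checking omitted):

  PCSetType(pc,PCFIELDSPLIT);
  PCFieldSplitSetIS(pc,"0",isu);
  PCFieldSplitSetIS(pc,"1",isp);
  PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);
  PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER);
  PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,NULL);

With selfp, the matrix used to build the Schur preconditioner is assembled explicitly as A11 - A10 inv(diag(A00)) A01; the ksp_view later in this thread reports it as "an assembled approximation to S".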
Beside the missing fielsplit_0_ part, the numbers are pretty close but not exactly the same. Any explanation? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From bsmith at mcs.anl.gov Thu Sep 4 07:07:55 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Sep 2014 07:07:55 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: <04CD9EF6-C206-4D8B-8C45-1756546670AB@mcs.anl.gov> If you are referring to https://bitbucket.org/petsc/petsc-dev please ignore it; that it is an outdated item that should have been removed a long time ago. Sorry for the confusion. You should start with the git repository https://bitbucket.org/petsc/petsc that should have all the branches you need. Barry On Sep 4, 2014, at 6:42 AM, Alp Kalpalp wrote: > Hi, > > Although I am experienced on C++ development on windows by using SVN, I need to learn too much things to use this algorithm I think.(petsc, debugging in linux, git, hg etc.) > So, sorry for the rookie questions in advance. > > You suggested me to use development version ----> petsc-dev (I believe I will compile and use it similar to petsc version of it), however it seems it has no branch named as "master", "next" or "stefano_zampini/pcbddc-primalfixes" > > These branches exist in "petsc" repo. Is it possible to merge petsc-dev and "stefano_zampini/pcbddc-primalfixes" > petsc-dev is hg, petsc is git ! > > or am I get it totaly wrong! I am confused. > > Could someone explain me the development hierarchy in petsc? what is petsc-dev and why it has no branches? > > > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini wrote: > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the dual of the other) and it does not have its own classes so far. > > That said, you can experiment with FETI-DP only after having setup a BDDC preconditioner with the options and customization you prefer. > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for manual pages. > > For an 'how to' with FETIDP, please see src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to obtain a right-hand side for the FETIDP system and a physical solution from the solution of the FETIDP system. > > I would recommend you to use the development version of the library and either use the ?next? branch or the ?master' branch after having merged in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also contains the new deluxe scaling operator for BDDC which is not available to use with FETI-DP. > > If you have any other questions which can be useful for other PETSc users, please use the mailing list; otherwise you can contact me personally. > > Stefano > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > >> Matthew Knepley writes: >>>> 1- Is it possible to complete a FETI-DP solution with the provided >>>> functions in current PetSc release? >>>> >>> >>> There is no FETI-DP in PETSc. >> >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You >> can enable it by configuring --with-pcbddc. 
This will be turned on by >> default soon. It is fairly new, so you should use the branch 'master' >> instead of the release. It has an option to do FETI-DP instead of BDDC. >> See src/ksp/ksp/examples/tutorials/ex59.c. >> >> For either of these methods, you have to assemble a MATIS. If you use >> MatSetValuesLocal, most of your assembly code can stay the same. >> >> Hopefully we can get better examples before the next release. Stefano >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but we >> really need more complete tests within PETSc. > > From alpkalpalp at gmail.com Thu Sep 4 07:17:23 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Thu, 4 Sep 2014 15:17:23 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <04CD9EF6-C206-4D8B-8C45-1756546670AB@mcs.anl.gov> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <04CD9EF6-C206-4D8B-8C45-1756546670AB@mcs.anl.gov> Message-ID: ok, so it is clear now, thank you On Thu, Sep 4, 2014 at 3:07 PM, Barry Smith wrote: > > If you are referring to https://bitbucket.org/petsc/petsc-dev please > ignore it; that it is an outdated item that should have been removed a long > time ago. Sorry for the confusion. > > You should start with the git repository > https://bitbucket.org/petsc/petsc that should have all the branches you > need. > > Barry > > > On Sep 4, 2014, at 6:42 AM, Alp Kalpalp wrote: > > > Hi, > > > > Although I am experienced on C++ development on windows by using SVN, I > need to learn too much things to use this algorithm I think.(petsc, > debugging in linux, git, hg etc.) > > So, sorry for the rookie questions in advance. > > > > You suggested me to use development version ----> petsc-dev (I believe I > will compile and use it similar to petsc version of it), however it seems > it has no branch named as "master", "next" or > "stefano_zampini/pcbddc-primalfixes" > > > > These branches exist in "petsc" repo. Is it possible to merge petsc-dev > and "stefano_zampini/pcbddc-primalfixes" > > petsc-dev is hg, petsc is git ! > > > > or am I get it totaly wrong! I am confused. > > > > Could someone explain me the development hierarchy in petsc? what is > petsc-dev and why it has no branches? > > > > > > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > stefano.zampini at gmail.com> wrote: > > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the > dual of the other) and it does not have its own classes so far. > > > > That said, you can experiment with FETI-DP only after having setup a > BDDC preconditioner with the options and customization you prefer. > > Use > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > for manual pages. > > > > For an 'how to' with FETIDP, please see > src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look > at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have > F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to > obtain a right-hand side for the FETIDP system and a physical solution from > the solution of the FETIDP system. > > > > I would recommend you to use the development version of the library and > either use the ?next? branch or the ?master' branch after having merged in > the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also > contains the new deluxe scaling operator for BDDC which is not available to > use with FETI-DP. 
> > > > If you have any other questions which can be useful for other PETSc > users, please use the mailing list; otherwise you can contact me personally. > > > > Stefano > > > > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > > >> Matthew Knepley writes: > >>>> 1- Is it possible to complete a FETI-DP solution with the provided > >>>> functions in current PetSc release? > >>>> > >>> > >>> There is no FETI-DP in PETSc. > >> > >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You > >> can enable it by configuring --with-pcbddc. This will be turned on by > >> default soon. It is fairly new, so you should use the branch 'master' > >> instead of the release. It has an option to do FETI-DP instead of BDDC. > >> See src/ksp/ksp/examples/tutorials/ex59.c. > >> > >> For either of these methods, you have to assemble a MATIS. If you use > >> MatSetValuesLocal, most of your assembly code can stay the same. > >> > >> Hopefully we can get better examples before the next release. Stefano > >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but we > >> really need more complete tests within PETSc. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 4 07:20:17 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Sep 2014 07:20:17 -0500 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> Message-ID: On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan wrote: > I'm playing with the selfp option in fieldsplit using > snes/examples/tutorials/ex70.c. For example: > > mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type lower \ > -pc_fieldsplit_schur_precondition selfp \ > -fieldsplit_1_inner_ksp_type preonly \ > -fieldsplit_1_inner_pc_type jacobi \ > -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ > -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ > -ksp_monitor -ksp_max_it 1 > > gives the following output > > 0 KSP Residual norm 1.229687498638e+00 > Residual norms for fieldsplit_1_ solve. > 0 KSP Residual norm 2.330138480101e+01 > 1 KSP Residual norm 1.609000846751e+01 > 1 KSP Residual norm 1.180287268335e+00 > > To my suprise I don't see anything for the fieldsplit_0_ solve, > why? > Always run with -ksp_view for any solver question. Thanks, Matt > Furthermore, if I understand correctly the above should be > exactly equivalent with > > mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type lower \ > -user_ksp \ > -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ > -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ > -ksp_monitor -ksp_max_it 1 > > 0 KSP Residual norm 1.229687498638e+00 > Residual norms for fieldsplit_0_ solve. > 0 KSP Residual norm 5.486639587672e-01 > 1 KSP Residual norm 6.348354253703e-02 > Residual norms for fieldsplit_1_ solve. > 0 KSP Residual norm 2.321938107977e+01 > 1 KSP Residual norm 1.605484031258e+01 > 1 KSP Residual norm 1.183225251166e+00 > > because -user_ksp replaces the Schur complement by the simple > approximation A11 - A10 inv(diag(A00)) A01. Beside the missing > fielsplit_0_ part, the numbers are pretty close but not exactly > the same. Any explanation? > > Chris > > > dr. ir. 
Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Thu Sep 4 07:26:32 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 4 Sep 2014 12:26:32 +0000 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local>, Message-ID: <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> Sorry, here's the ksp_view. I'm expecting -fieldsplit_1_inner_ksp_type preonly to set the ksp(A00) in the Schur complement only, but it seems to set it in the inv(A00) of the diagonal as well. Chris ? 0 KSP Residual norm 1.229687498638e+00 ??? Residual norms for fieldsplit_1_ solve. ??? 0 KSP Residual norm 7.185799114488e+01 ??? 1 KSP Residual norm 3.873274154012e+01 ? 1 KSP Residual norm 1.107969383366e+00 KSP Object: 1 MPI processes ? type: fgmres ??? GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement ??? GMRES: happy breakdown tolerance 1e-30 ? maximum iterations=1, initial guess is zero ? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ? right preconditioning ? using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes ? type: fieldsplit ??? FieldSplit with Schur preconditioner, factorization LOWER ??? Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse ??? Split info: ??? Split number 0 Defined by IS ??? Split number 1 Defined by IS ??? KSP solver for A00 block ????? KSP Object:????? (fieldsplit_0_)?????? 1 MPI processes ??????? type: preonly ??????? maximum iterations=1, initial guess is zero ??????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????? left preconditioning ??????? using NONE norm type for convergence test ????? PC Object:????? (fieldsplit_0_)?????? 1 MPI processes ??????? type: bjacobi ????????? block Jacobi: number of blocks = 1 ????????? Local solve is same for all blocks, in the following KSP and PC objects: ????????? KSP Object:????????? (fieldsplit_0_sub_)?????????? 1 MPI processes ??????????? type: preonly ??????????? maximum iterations=10000, initial guess is zero ??????????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????????? left preconditioning ??????????? using NONE norm type for convergence test ????????? PC Object:????????? (fieldsplit_0_sub_)?????????? 1 MPI processes ??????????? type: ilu ????????????? ILU: out-of-place factorization ????????????? 0 levels of fill ????????????? tolerance for zero pivot 2.22045e-14 ????????????? using diagonal shift on blocks to prevent zero pivot [INBLOCKS] ????????????? matrix ordering: natural ????????????? factor fill ratio given 1, needed 1 ??????????????? Factored matrix follows: ????????????????? Mat Object:?????????????????? 1 MPI processes ??????????????????? type: seqaij ??????????????????? rows=48, cols=48 ??????????????????? package used to perform factorization: petsc ??????????????????? 
total: nonzeros=200, allocated nonzeros=200 ??????????????????? total number of mallocs used during MatSetValues calls =0 ????????????????????? not using I-node routines ??????????? linear system matrix = precond matrix: ??????????? Mat Object:??????????? (fieldsplit_0_)???????????? 1 MPI processes ????????????? type: seqaij ????????????? rows=48, cols=48 ????????????? total: nonzeros=200, allocated nonzeros=240 ????????????? total number of mallocs used during MatSetValues calls =0 ??????????????? not using I-node routines ??????? linear system matrix = precond matrix: ??????? Mat Object:??????? (fieldsplit_0_)???????? 1 MPI processes ????????? type: mpiaij ????????? rows=48, cols=48 ????????? total: nonzeros=200, allocated nonzeros=480 ????????? total number of mallocs used during MatSetValues calls =0 ??????????? not using I-node (on process 0) routines ??? KSP solver for S = A11 - A10 inv(A00) A01 ????? KSP Object:????? (fieldsplit_1_)?????? 1 MPI processes ??????? type: gmres ????????? GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement ????????? GMRES: happy breakdown tolerance 1e-30 ??????? maximum iterations=1, initial guess is zero ??????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????? left preconditioning ??????? using PRECONDITIONED norm type for convergence test ????? PC Object:????? (fieldsplit_1_)?????? 1 MPI processes ??????? type: bjacobi ????????? block Jacobi: number of blocks = 1 ????????? Local solve is same for all blocks, in the following KSP and PC objects: ????????? KSP Object:????????? (fieldsplit_1_sub_)?????????? 1 MPI processes ??????????? type: preonly ??????????? maximum iterations=10000, initial guess is zero ??????????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????????? left preconditioning ??????????? using NONE norm type for convergence test ????????? PC Object:????????? (fieldsplit_1_sub_)?????????? 1 MPI processes ??????????? type: bjacobi ????????????? block Jacobi: number of blocks = 1 ????????????? Local solve is same for all blocks, in the following KSP and PC objects: ????????????? KSP Object:????????????? (fieldsplit_1_sub_sub_)?????????????? 1 MPI processes ??????????????? type: preonly ??????????????? maximum iterations=10000, initial guess is zero ??????????????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????????????? left preconditioning ??????????????? using NONE norm type for convergence test ????????????? PC Object:????????????? (fieldsplit_1_sub_sub_)?????????????? 1 MPI processes ??????????????? type: ilu ????????????????? ILU: out-of-place factorization ????????????????? 0 levels of fill ????????????????? tolerance for zero pivot 2.22045e-14 ????????????????? using diagonal shift on blocks to prevent zero pivot [INBLOCKS] ????????????????? matrix ordering: natural ????????????????? factor fill ratio given 1, needed 1 ??????????????????? Factored matrix follows: ????????????????????? Mat Object:?????????????????????? 1 MPI processes ??????????????????????? type: seqaij ??????????????????????? rows=24, cols=24 ??????????????????????? package used to perform factorization: petsc ??????????????????????? total: nonzeros=120, allocated nonzeros=120 ??????????????????????? total number of mallocs used during MatSetValues calls =0 ????????????????????????? not using I-node routines ??????????????? linear system matrix = precond matrix: ??????????????? Mat Object:???????????????? 1 MPI processes ????????????????? 
type: seqaij ????????????????? rows=24, cols=24 ????????????????? total: nonzeros=120, allocated nonzeros=120 ????????????????? total number of mallocs used during MatSetValues calls =0 ??????????????????? not using I-node routines ??????????? linear system matrix = precond matrix: ??????????? Mat Object:???????????? 1 MPI processes ????????????? type: mpiaij ????????????? rows=24, cols=24 ????????????? total: nonzeros=120, allocated nonzeros=120 ????????????? total number of mallocs used during MatSetValues calls =0 ??????????????? not using I-node (on process 0) routines ??????? linear system matrix followed by preconditioner matrix: ??????? Mat Object:??????? (fieldsplit_1_)???????? 1 MPI processes ????????? type: schurcomplement ????????? rows=24, cols=24 ??????????? Schur complement A11 - A10 inv(A00) A01 ??????????? A11 ????????????? Mat Object:????????????? (fieldsplit_1_)?????????????? 1 MPI processes ??????????????? type: mpiaij ??????????????? rows=24, cols=24 ??????????????? total: nonzeros=0, allocated nonzeros=0 ??????????????? total number of mallocs used during MatSetValues calls =0 ????????????????? using I-node (on process 0) routines: found 5 nodes, limit used is 5 ??????????? A10 ????????????? Mat Object:????????????? (a10_)?????????????? 1 MPI processes ??????????????? type: mpiaij ??????????????? rows=24, cols=48 ??????????????? total: nonzeros=96, allocated nonzeros=96 ??????????????? total number of mallocs used during MatSetValues calls =0 ????????????????? not using I-node (on process 0) routines ??????????? KSP of A00 ????????????? KSP Object:????????????? (fieldsplit_1_inner_)?????????????? 1 MPI processes ??????????????? type: preonly ??????????????? maximum iterations=1, initial guess is zero ??????????????? tolerances:? relative=1e-05, absolute=1e-50, divergence=10000 ??????????????? left preconditioning ??????????????? using NONE norm type for convergence test ????????????? PC Object:????????????? (fieldsplit_1_inner_)?????????????? 1 MPI processes ??????????????? type: jacobi ??????????????? linear system matrix = precond matrix: ??????????????? Mat Object:??????????????? (fieldsplit_0_)???????????????? 1 MPI processes ????????????????? type: mpiaij ????????????????? rows=48, cols=48 ????????????????? total: nonzeros=200, allocated nonzeros=480 ????????????????? total number of mallocs used during MatSetValues calls =0 ??????????????????? not using I-node (on process 0) routines ??????????? A01 ????????????? Mat Object:????????????? (a01_)?????????????? 1 MPI processes ??????????????? type: mpiaij ??????????????? rows=48, cols=24 ??????????????? total: nonzeros=96, allocated nonzeros=480 ??????????????? total number of mallocs used during MatSetValues calls =0 ????????????????? not using I-node (on process 0) routines ??????? Mat Object:???????? 1 MPI processes ????????? type: mpiaij ????????? rows=24, cols=24 ????????? total: nonzeros=120, allocated nonzeros=120 ????????? total number of mallocs used during MatSetValues calls =0 ??????????? not using I-node (on process 0) routines ? linear system matrix = precond matrix: ? Mat Object:?? 1 MPI processes ??? type: nest ??? rows=72, cols=72 ????? Matrix object: ??????? type=nest, rows=2, cols=2 ??????? MatNest structure: ??????? (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48 ??????? (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 ??????? (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 ??????? 
(1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24 From: Matthew Knepley Sent: Thursday, September 04, 2014 2:20 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp ? On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan wrote: I'm playing with the selfp option in fieldsplit using snes/examples/tutorials/ex70.c. For example: mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -pc_fieldsplit_schur_precondition selfp \ -fieldsplit_1_inner_ksp_type preonly \ -fieldsplit_1_inner_pc_type jacobi \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 gives the following output ? 0 KSP Residual norm 1.229687498638e+00 ? ? Residual norms for fieldsplit_1_ solve. ? ? 0 KSP Residual norm 2.330138480101e+01 ? ? 1 KSP Residual norm 1.609000846751e+01 ? 1 KSP Residual norm 1.180287268335e+00 To my suprise I don't see anything for the fieldsplit_0_ solve, why? Always run with -ksp_view for any solver question. ? Thanks, ? ? Matt ? Furthermore, if I understand correctly the above should be exactly equivalent with mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -user_ksp \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 ? 0 KSP Residual norm 1.229687498638e+00 ? ? Residual norms for fieldsplit_0_ solve. ? ? 0 KSP Residual norm 5.486639587672e-01 ? ? 1 KSP Residual norm 6.348354253703e-02 ? ? Residual norms for fieldsplit_1_ solve. ? ? 0 KSP Residual norm 2.321938107977e+01 ? ? 1 KSP Residual norm 1.605484031258e+01 ? 1 KSP Residual norm 1.183225251166e+00 because -user_ksp replaces the Schur complement by the simple approximation A11 - A10 inv(diag(A00)) A01. Beside the missing fielsplit_0_ part, the numbers are pretty close but not exactly the same. Any explanation? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From alpkalpalp at gmail.com Thu Sep 4 08:52:38 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Thu, 4 Sep 2014 16:52:38 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: Dear Stefano, I have checked out "master" branch and merged with your stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all tests are completed successfully. Then, I tried to compile ex59 with make ex59, it results in unresolved external error. I believe your bddc_feti files are not included in compilation. 
Since I am not experienced on how to solve issues in Petsc, I need to ask several questions; 1-) Are there any global settings to add additonal directories to compilation (src\ksp\pc\impls\bddc) 2-) or should I include these files on top of ex59 (AFAIK, including .c files is not a good thing) 3-) and finally what is the better way of helping you (creating another branch from yours or what) Thanks in advance On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini wrote: > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the > dual of the other) and it does not have its own classes so far. > > That said, you can experiment with FETI-DP only after having setup a BDDC > preconditioner with the options and customization you prefer. > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for > manual pages. > > For an 'how to' with FETIDP, please see > src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look > at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have > F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to > obtain a right-hand side for the FETIDP system and a physical solution from > the solution of the FETIDP system. > > I would recommend you to use the development version of the library and > either use the ?next? branch or the ?master' branch after having merged > in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also > contains the new deluxe scaling operator for BDDC which is not available > to use with FETI-DP. > > If you have any other questions which can be useful for other PETSc users, > please use the mailing list; otherwise you can contact me personally. > > Stefano > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > Matthew Knepley writes: > > 1- Is it possible to complete a FETI-DP solution with the provided > functions in current PetSc release? > > > There is no FETI-DP in PETSc. > > > Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You > can enable it by configuring --with-pcbddc. This will be turned on by > default soon. It is fairly new, so you should use the branch 'master' > instead of the release. It has an option to do FETI-DP instead of BDDC. > See src/ksp/ksp/examples/tutorials/ex59.c. > > For either of these methods, you have to assemble a MATIS. If you use > MatSetValuesLocal, most of your assembly code can stay the same. > > Hopefully we can get better examples before the next release. Stefano > (the author of PCBDDC, Cc'd) tests mostly with external packages, but we > really need more complete tests within PETSc. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 4 09:37:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Sep 2014 09:37:42 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> This is likely due to the horrible horrible fact that some of the bddc files only get compiled if ./configure is run with the option --with-pcbddc you will need to rerun ./configure and then make with that option. I pray that someone removes that horrible confusing configure option. 
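As background for the MATIS requirement Jed mentions in the text quoted below ("you have to assemble a MATIS ... MatSetValuesLocal"), a rough C sketch of that assembly style; nlocal, l2g_idx, N, nen, elem_rows and elem_mat stand in for the application's own subdomain numbering and element data, the calls follow the petsc master/3.5 signatures, and preallocation and error checking are omitted.

  Mat                    A;
  ISLocalToGlobalMapping l2g;

  ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,nlocal,l2g_idx,PETSC_COPY_VALUES,&l2g);
  MatCreate(PETSC_COMM_WORLD,&A);
  MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,N,N);
  MatSetType(A,MATIS);                     /* unassembled format used by PCBDDC/FETI-DP */
  MatSetLocalToGlobalMapping(A,l2g,l2g);
  /* element loop: insert each element matrix with local (subdomain) indices */
  MatSetValuesLocal(A,nen,elem_rows,nen,elem_rows,elem_mat,ADD_VALUES);
  MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);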
Barry On Sep 4, 2014, at 8:52 AM, Alp Kalpalp wrote: > Dear Stefano, > > I have checked out "master" branch and merged with your stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all tests are completed successfully. > Then, I tried to compile ex59 with make ex59, it results in unresolved external error. I believe your bddc_feti files are not included in compilation. > Since I am not experienced on how to solve issues in Petsc, I need to ask several questions; > > 1-) Are there any global settings to add additonal directories to compilation (src\ksp\pc\impls\bddc) > 2-) or should I include these files on top of ex59 (AFAIK, including .c files is not a good thing) > 3-) and finally what is the better way of helping you (creating another branch from yours or what) > > Thanks in advance > > > > > > > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini wrote: > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the dual of the other) and it does not have its own classes so far. > > That said, you can experiment with FETI-DP only after having setup a BDDC preconditioner with the options and customization you prefer. > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for manual pages. > > For an 'how to' with FETIDP, please see src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to obtain a right-hand side for the FETIDP system and a physical solution from the solution of the FETIDP system. > > I would recommend you to use the development version of the library and either use the ?next? branch or the ?master' branch after having merged in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also contains the new deluxe scaling operator for BDDC which is not available to use with FETI-DP. > > If you have any other questions which can be useful for other PETSc users, please use the mailing list; otherwise you can contact me personally. > > Stefano > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > >> Matthew Knepley writes: >>>> 1- Is it possible to complete a FETI-DP solution with the provided >>>> functions in current PetSc release? >>>> >>> >>> There is no FETI-DP in PETSc. >> >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You >> can enable it by configuring --with-pcbddc. This will be turned on by >> default soon. It is fairly new, so you should use the branch 'master' >> instead of the release. It has an option to do FETI-DP instead of BDDC. >> See src/ksp/ksp/examples/tutorials/ex59.c. >> >> For either of these methods, you have to assemble a MATIS. If you use >> MatSetValuesLocal, most of your assembly code can stay the same. >> >> Hopefully we can get better examples before the next release. Stefano >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but we >> really need more complete tests within PETSc. > > From C.Klaij at marin.nl Thu Sep 4 09:59:30 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 4 Sep 2014 14:59:30 +0000 Subject: [petsc-users] suggestion for PCFieldSplitGetSubKSP Message-ID: <82ceac8e1eb6489fb51603fc5c1d190a@MAR190n2.marin.local> I'm using PCFieldSplitGetSubKSP to access and modify the various KSP's that occur in a Schur preconditioner. 
Now, PCFieldSplitGetSubKSP returns two KSP's: one corresponding to A00 and one corresponding to the Schur complement. However, in the LDU factorization, ksp(A00) occurs 4 times: in the upper block, in the diagonal block, in the Schur complement of the diagonal block and in the lower block. So far, I'm using the runtime option -fieldsplit_1_inner to access the ksp(A00) in the Schur complement, but it would make sence if PCFieldSplitGetSubKSP returns all 5 KSP's involved. Or is there another subroutine that I could call? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From salazardetroya at gmail.com Thu Sep 4 12:21:01 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 4 Sep 2014 12:21:01 -0500 Subject: [petsc-users] Initial estimation on SNES and KSP Message-ID: Dear all SNES uses internally a KSP to solve the linear system of equations right? Now the case that we had a linear system of equations that we are solving with SNES, how could we set the initial estimation for the KSP? If we just included the option -ksp_initial_guess_nonzero, the KSP will grab the vector X we passed to the SNES? Thanks in advance. Miguel -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Sep 4 12:28:30 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 04 Sep 2014 11:28:30 -0600 Subject: [petsc-users] Initial estimation on SNES and KSP In-Reply-To: References: Message-ID: <878ulz2t35.fsf@jedbrown.org> Miguel Angel Salazar de Troya writes: > Dear all > > SNES uses internally a KSP to solve the linear system of equations right? > Now the case that we had a linear system of equations that we are solving > with SNES, how could we set the initial estimation for the KSP? If we just > included the option -ksp_initial_guess_nonzero, the KSP will grab the > vector X we passed to the SNES? You definitely don't want this for Newton-type methods. If you have a nonzero guess for the solution of the linear system, you should have evaluated the Jacobian at that point. A zero initial guess is optimal for Newton-type methods. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Thu Sep 4 12:53:48 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Sep 2014 12:53:48 -0500 Subject: [petsc-users] Initial estimation on SNES and KSP In-Reply-To: <878ulz2t35.fsf@jedbrown.org> References: <878ulz2t35.fsf@jedbrown.org> Message-ID: On Thu, Sep 4, 2014 at 12:28 PM, Jed Brown wrote: > Miguel Angel Salazar de Troya writes: > > > Dear all > > > > SNES uses internally a KSP to solve the linear system of equations right? > > Now the case that we had a linear system of equations that we are solving > > with SNES, how could we set the initial estimation for the KSP? If we > just > > included the option -ksp_initial_guess_nonzero, the KSP will grab the > > vector X we passed to the SNES? > > You definitely don't want this for Newton-type methods. 
If you have a > nonzero guess for the solution of the linear system, you should have > evaluated the Jacobian at that point. A zero initial guess is optimal > for Newton-type methods. > Notice that Newton is solving for the correction, not the solution itself. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 4 17:36:39 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Sep 2014 17:36:39 -0500 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> Message-ID: On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan wrote: > Sorry, here's the ksp_view. I'm expecting > > -fieldsplit_1_inner_ksp_type preonly > > to set the ksp(A00) in the Schur complement only, but it seems to set it > in the inv(A00) of the diagonal as well. > I think something is wrong in your example (we strongly advise against using MatNest directly). I cannot reproduce this using SNES ex62: ./config/builder2.py check src/snes/examples/tutorials/ex62.c --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi" which translates to ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi gives Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 
maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning has attached null space using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Matt > Chris > > 0 KSP Residual norm 1.229687498638e+00 > Residual norms for 
fieldsplit_1_ solve. > 0 KSP Residual norm 7.185799114488e+01 > 1 KSP Residual norm 3.873274154012e+01 > 1 KSP Residual norm 1.107969383366e+00 > KSP Object: 1 MPI processes > type: fgmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization LOWER > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (the lumped) A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_0_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object: (fieldsplit_0_sub_) 1 MPI > processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_0_sub_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=48, cols=48 > package used to perform factorization: petsc > total: nonzeros=200, allocated nonzeros=200 > total number of mallocs used during MatSetValues calls > =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI > processes > type: seqaij > rows=48, cols=48 > total: nonzeros=200, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) 1 MPI processes > type: mpiaij > rows=48, cols=48 > total: nonzeros=200, allocated nonzeros=480 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_1_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_1_) 1 MPI processes > type: bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object: (fieldsplit_1_sub_) 1 MPI > processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_sub_) 1 MPI processes > type: 
bjacobi > block Jacobi: number of blocks = 1 > Local solve is same for all blocks, in the following KSP and > PC objects: > KSP Object: > (fieldsplit_1_sub_sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: > (fieldsplit_1_sub_sub_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=24, cols=24 > package used to perform factorization: petsc > total: nonzeros=120, allocated nonzeros=120 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=24, cols=24 > total: nonzeros=120, allocated nonzeros=120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mpiaij > rows=24, cols=24 > total: nonzeros=120, allocated nonzeros=120 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_1_) 1 MPI processes > type: schurcomplement > rows=24, cols=24 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_1_) 1 MPI > processes > type: mpiaij > rows=24, cols=24 > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 5 nodes, > limit used is 5 > A10 > Mat Object: (a10_) 1 MPI processes > type: mpiaij > rows=24, cols=48 > total: nonzeros=96, allocated nonzeros=96 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > KSP of A00 > KSP Object: (fieldsplit_1_inner_) > 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_inner_) > 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Mat Object: (fieldsplit_0_) > 1 MPI processes > type: mpiaij > rows=48, cols=48 > total: nonzeros=200, allocated nonzeros=480 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > A01 > Mat Object: (a01_) 1 MPI processes > type: mpiaij > rows=48, cols=24 > total: nonzeros=96, allocated nonzeros=480 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > Mat Object: 1 MPI processes > type: mpiaij > rows=24, cols=24 > total: nonzeros=120, allocated nonzeros=120 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: nest > rows=72, cols=72 > Matrix object: > type=nest, rows=2, cols=2 > MatNest structure: > (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48 > (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 > (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 > (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, 
cols=24 > > > From: Matthew Knepley > Sent: Thursday, September 04, 2014 2:20 PM > To: Klaij, Christiaan > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp > > > > > On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan > wrote: > I'm playing with the selfp option in fieldsplit using > snes/examples/tutorials/ex70.c. For example: > > mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type lower \ > -pc_fieldsplit_schur_precondition selfp \ > -fieldsplit_1_inner_ksp_type preonly \ > -fieldsplit_1_inner_pc_type jacobi \ > -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ > -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ > -ksp_monitor -ksp_max_it 1 > > gives the following output > > 0 KSP Residual norm 1.229687498638e+00 > Residual norms for fieldsplit_1_ solve. > 0 KSP Residual norm 2.330138480101e+01 > 1 KSP Residual norm 1.609000846751e+01 > 1 KSP Residual norm 1.180287268335e+00 > > To my suprise I don't see anything for the fieldsplit_0_ solve, > why? > > > > Always run with -ksp_view for any solver question. > > > Thanks, > > > Matt > Furthermore, if I understand correctly the above should be > exactly equivalent with > > mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type lower \ > -user_ksp \ > -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ > -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ > -ksp_monitor -ksp_max_it 1 > > 0 KSP Residual norm 1.229687498638e+00 > Residual norms for fieldsplit_0_ solve. > 0 KSP Residual norm 5.486639587672e-01 > 1 KSP Residual norm 6.348354253703e-02 > Residual norms for fieldsplit_1_ solve. > 0 KSP Residual norm 2.321938107977e+01 > 1 KSP Residual norm 1.605484031258e+01 > 1 KSP Residual norm 1.183225251166e+00 > > because -user_ksp replaces the Schur complement by the simple > approximation A11 - A10 inv(diag(A00)) A01. Beside the missing > fielsplit_0_ part, the numbers are pretty close but not exactly > the same. Any explanation? > > Chris > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > E mailto:C.Klaij at marin.nl > T +31 317 49 33 44 > > > MARIN > 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands > T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From alpkalpalp at gmail.com Fri Sep 5 04:27:07 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Fri, 5 Sep 2014 12:27:07 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: As I said before, I have checked out "master" branch and merged with your stefano_zampini/pcbddc-primalfixe branch. configured, compiled successfully. I have used --with-pcbddc option in configure as Barry suggested. 
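(For reference, the checkout/merge/rebuild sequence described above amounts to roughly the
following; the branch spelling follows Stefano's message quoted below, while the remote name
and the remaining configure options are assumptions and would have to match the original setup:

  git checkout master
  git merge origin/stefano_zampini/pcbddc-primalfixes
  ./configure --with-pcbddc <same options as the previous configure run>
  make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug all
)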
However, tests are failed with following reason: akalpalp at a-kalpalp ~/petsc $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug test Running test examples to verify correct installation Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /home/akalpalp/petsc/src/snes/examples/tutorials ex19 ********************************************************************************* /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -I/home/akalpalp/petsc/include -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): undefined reference to `PCCreate_BDDC' collect2: error: ld returned 1 exit status makefile:108: recipe for target 'ex19' failed make[3]: [ex19] Error 1 (ignored) /usr/bin/rm -f ex19.o Completed test examples ========================================= Now to evaluate the computer systems you plan use - do: make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug streams NPMAX= On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith wrote: > > This is likely due to the horrible horrible fact that some of the bddc > files only get compiled if ./configure is run with the option --with-pcbddc > you will need to rerun ./configure and then make with that option. > > I pray that someone removes that horrible confusing configure option. > > Barry > > On Sep 4, 2014, at 8:52 AM, Alp Kalpalp wrote: > > > Dear Stefano, > > > > I have checked out "master" branch and merged with your > stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all > tests are completed successfully. > > Then, I tried to compile ex59 with make ex59, it results in unresolved > external error. I believe your bddc_feti files are not included in > compilation. > > Since I am not experienced on how to solve issues in Petsc, I need to > ask several questions; > > > > 1-) Are there any global settings to add additonal directories to > compilation (src\ksp\pc\impls\bddc) > > 2-) or should I include these files on top of ex59 (AFAIK, including .c > files is not a good thing) > > 3-) and finally what is the better way of helping you (creating another > branch from yours or what) > > > > Thanks in advance > > > > > > > > > > > > > > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > stefano.zampini at gmail.com> wrote: > > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the > dual of the other) and it does not have its own classes so far. > > > > That said, you can experiment with FETI-DP only after having setup a > BDDC preconditioner with the options and customization you prefer. > > Use > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > for manual pages. 
> > > > For an 'how to' with FETIDP, please see > src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look > at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have > F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to > obtain a right-hand side for the FETIDP system and a physical solution from > the solution of the FETIDP system. > > > > I would recommend you to use the development version of the library and > either use the ?next? branch or the ?master' branch after having merged in > the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also > contains the new deluxe scaling operator for BDDC which is not available to > use with FETI-DP. > > > > If you have any other questions which can be useful for other PETSc > users, please use the mailing list; otherwise you can contact me personally. > > > > Stefano > > > > > > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > > >> Matthew Knepley writes: > >>>> 1- Is it possible to complete a FETI-DP solution with the provided > >>>> functions in current PetSc release? > >>>> > >>> > >>> There is no FETI-DP in PETSc. > >> > >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. You > >> can enable it by configuring --with-pcbddc. This will be turned on by > >> default soon. It is fairly new, so you should use the branch 'master' > >> instead of the release. It has an option to do FETI-DP instead of BDDC. > >> See src/ksp/ksp/examples/tutorials/ex59.c. > >> > >> For either of these methods, you have to assemble a MATIS. If you use > >> MatSetValuesLocal, most of your assembly code can stay the same. > >> > >> Hopefully we can get better examples before the next release. Stefano > >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but we > >> really need more complete tests within PETSc. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 5 06:51:49 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 Sep 2014 06:51:49 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: Please send your configure.log and make.log Thanks, Matt On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp wrote: > As I said before, I have checked out "master" branch and merged with your > stefano_zampini/pcbddc-primalfixe branch. configured, compiled successfully. > I have used --with-pcbddc option in configure as Barry suggested. 
> > However, tests are failed with following reason: > > akalpalp at a-kalpalp ~/petsc > $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug test > Running test examples to verify correct installation > Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /home/akalpalp/petsc/src/snes/examples/tutorials ex19 > > ********************************************************************************* > /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > -I/home/akalpalp/petsc/include > -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c > /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o > -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc > -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack > -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl > /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): > undefined reference to `PCCreate_BDDC' > collect2: error: ld returned 1 exit status > makefile:108: recipe for target 'ex19' failed > make[3]: [ex19] Error 1 (ignored) > /usr/bin/rm -f ex19.o > Completed test examples > ========================================= > Now to evaluate the computer systems you plan use - do: > make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug streams > NPMAX= > > > > On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith wrote: > >> >> This is likely due to the horrible horrible fact that some of the bddc >> files only get compiled if ./configure is run with the option --with-pcbddc >> you will need to rerun ./configure and then make with that option. >> >> I pray that someone removes that horrible confusing configure option. >> >> Barry >> >> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp wrote: >> >> > Dear Stefano, >> > >> > I have checked out "master" branch and merged with your >> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all >> tests are completed successfully. >> > Then, I tried to compile ex59 with make ex59, it results in unresolved >> external error. I believe your bddc_feti files are not included in >> compilation. >> > Since I am not experienced on how to solve issues in Petsc, I need to >> ask several questions; >> > >> > 1-) Are there any global settings to add additonal directories to >> compilation (src\ksp\pc\impls\bddc) >> > 2-) or should I include these files on top of ex59 (AFAIK, including .c >> files is not a good thing) >> > 3-) and finally what is the better way of helping you (creating another >> branch from yours or what) >> > >> > Thanks in advance >> > >> > >> > >> > >> > >> > >> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is >> the dual of the other) and it does not have its own classes so far. >> > >> > That said, you can experiment with FETI-DP only after having setup a >> BDDC preconditioner with the options and customization you prefer. >> > Use >> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html >> for manual pages. 
>> > >> > For an 'how to' with FETIDP, please see >> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look >> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented >> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have >> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to >> obtain a right-hand side for the FETIDP system and a physical solution from >> the solution of the FETIDP system. >> > >> > I would recommend you to use the development version of the library and >> either use the ?next? branch or the ?master' branch after having merged in >> the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also >> contains the new deluxe scaling operator for BDDC which is not available to >> use with FETI-DP. >> > >> > If you have any other questions which can be useful for other PETSc >> users, please use the mailing list; otherwise you can contact me personally. >> > >> > Stefano >> > >> > >> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: >> > >> >> Matthew Knepley writes: >> >>>> 1- Is it possible to complete a FETI-DP solution with the provided >> >>>> functions in current PetSc release? >> >>>> >> >>> >> >>> There is no FETI-DP in PETSc. >> >> >> >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. >> You >> >> can enable it by configuring --with-pcbddc. This will be turned on by >> >> default soon. It is fairly new, so you should use the branch 'master' >> >> instead of the release. It has an option to do FETI-DP instead of >> BDDC. >> >> See src/ksp/ksp/examples/tutorials/ex59.c. >> >> >> >> For either of these methods, you have to assemble a MATIS. If you use >> >> MatSetValuesLocal, most of your assembly code can stay the same. >> >> >> >> Hopefully we can get better examples before the next release. Stefano >> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but >> we >> >> really need more complete tests within PETSc. >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 5 07:10:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 Sep 2014 07:10:26 -0500 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: <1f81bb0885e94ce59a1f4aa683619cbb@MAR190N1.marin.local> References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> <1f81bb0885e94ce59a1f4aa683619cbb@MAR190N1.marin.local> Message-ID: On Fri, Sep 5, 2014 at 1:34 AM, Klaij, Christiaan wrote: > Matt, > > I think the problem is somehow related to > -pc_fieldsplit_schur_precondition selfp. In the example below your are not > using that option. > Here is the selfp output. It retains the A00 solver. 
ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition selfp SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by 
preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=2601, allocated nonzeros=2601 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Thanks, Matt > Chris > > dr. ir. Christiaan Klaij > > CFD Researcher > Research & Development > > > > *MARIN* > > > 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 > AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I > www.marin.nl > > > > MARIN news: MARIN at SMM, Hamburg, September 9-12 > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. > > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Friday, September 05, 2014 12:36 AM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] fieldsplit_0_ monitor in combination with > selfp > > On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan > wrote: > >> Sorry, here's the ksp_view. I'm expecting >> >> -fieldsplit_1_inner_ksp_type preonly >> >> to set the ksp(A00) in the Schur complement only, but it seems to set it >> in the inv(A00) of the diagonal as well. >> > > I think something is wrong in your example (we strongly advise against > using MatNest directly). 
I cannot reproduce this using SNES ex62: > > ./config/builder2.py check src/snes/examples/tutorials/ex62.c > --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly > -fieldsplit_pressure_inner_pc_type jacobi" > > which translates to > > ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet > -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type > fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit > -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres > -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi > -snes_monitor_short -ksp_monitor_short -snes_converged_reason > -ksp_converged_reason -snes_view -show_solution 0 > -fieldsplit_pressure_inner_ksp_type preonly > -fieldsplit_pressure_inner_pc_type jacobi > > gives > > Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 > SNES Object: 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=20 > total number of function evaluations=2 > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: fgmres > GMRES: restart=100, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > right preconditioning > has attached null space > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 3.45047 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=962, cols=962 > package used to perform factorization: petsc > total: nonzeros=68692, allocated nonzeros=68692 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 456 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) 1 MPI > processes > type: seqaij > rows=962, cols=962 > total: nonzeros=19908, allocated nonzeros=19908 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 481 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_pressure_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > 
Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-10, absolute=1e-50, divergence=10000 > left preconditioning > has attached null space > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_pressure_) 1 MPI processes > type: jacobi > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: schurcomplement > rows=145, cols=145 > has attached null space > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_pressure_) > 1 MPI processes > type: seqaij > rows=145, cols=145 > total: nonzeros=945, allocated nonzeros=945 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=145, cols=962 > total: nonzeros=4466, allocated nonzeros=4466 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: (fieldsplit_pressure_inner_) > 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, > divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_pressure_inner_) > 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) > 1 MPI processes > type: seqaij > rows=962, cols=962 > total: nonzeros=19908, allocated nonzeros=19908 > total number of mallocs used during MatSetValues calls > =0 > using I-node routines: found 481 nodes, limit used > is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=962, cols=145 > total: nonzeros=4466, allocated nonzeros=4466 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 481 nodes, limit used is 5 > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: seqaij > rows=145, cols=145 > total: nonzeros=945, allocated nonzeros=945 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1107, cols=1107 > total: nonzeros=29785, allocated nonzeros=29785 > total number of mallocs used during MatSetValues calls =0 > has attached null space > using I-node routines: found 513 nodes, limit used is 5 > > Matt > > >> Chris >> >> 0 KSP Residual norm 1.229687498638e+00 >> Residual norms for fieldsplit_1_ solve. 
>> 0 KSP Residual norm 7.185799114488e+01 >> 1 KSP Residual norm 3.873274154012e+01 >> 1 KSP Residual norm 1.107969383366e+00 >> KSP Object: 1 MPI processes >> type: fgmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization LOWER >> Preconditioner for the Schur complement formed from Sp, an assembled >> approximation to S, which uses (the lumped) A00's diagonal's inverse >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_0_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_) 1 MPI processes >> type: bjacobi >> block Jacobi: number of blocks = 1 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object: (fieldsplit_0_sub_) 1 MPI >> processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_0_sub_) 1 MPI >> processes >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot >> [INBLOCKS] >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=48, cols=48 >> package used to perform factorization: petsc >> total: nonzeros=200, allocated nonzeros=200 >> total number of mallocs used during MatSetValues >> calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI >> processes >> type: seqaij >> rows=48, cols=48 >> total: nonzeros=200, allocated nonzeros=240 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=48, cols=48 >> total: nonzeros=200, allocated nonzeros=480 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_1_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_1_) 1 MPI processes >> type: bjacobi >> block Jacobi: number of blocks = 1 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object: (fieldsplit_1_sub_) 1 MPI >> processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using NONE norm type 
for convergence test >> PC Object: (fieldsplit_1_sub_) 1 MPI >> processes >> type: bjacobi >> block Jacobi: number of blocks = 1 >> Local solve is same for all blocks, in the following KSP >> and PC objects: >> KSP Object: >> (fieldsplit_1_sub_sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: >> (fieldsplit_1_sub_sub_) 1 MPI processes >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> using diagonal shift on blocks to prevent zero pivot >> [INBLOCKS] >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=24, cols=24 >> package used to perform factorization: petsc >> total: nonzeros=120, allocated nonzeros=120 >> total number of mallocs used during MatSetValues >> calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=24, cols=24 >> total: nonzeros=120, allocated nonzeros=120 >> total number of mallocs used during MatSetValues calls >> =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=24, cols=24 >> total: nonzeros=120, allocated nonzeros=120 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_1_) 1 MPI processes >> type: schurcomplement >> rows=24, cols=24 >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_1_) 1 >> MPI processes >> type: mpiaij >> rows=24, cols=24 >> total: nonzeros=0, allocated nonzeros=0 >> total number of mallocs used during MatSetValues calls =0 >> using I-node (on process 0) routines: found 5 nodes, >> limit used is 5 >> A10 >> Mat Object: (a10_) 1 MPI >> processes >> type: mpiaij >> rows=24, cols=48 >> total: nonzeros=96, allocated nonzeros=96 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> KSP of A00 >> KSP Object: >> (fieldsplit_1_inner_) 1 MPI processes >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_1_inner_) >> 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Mat Object: >> (fieldsplit_0_) 1 MPI processes >> type: mpiaij >> rows=48, cols=48 >> total: nonzeros=200, allocated nonzeros=480 >> total number of mallocs used during MatSetValues calls >> =0 >> not using I-node (on process 0) routines >> A01 >> Mat Object: (a01_) 1 MPI >> processes >> type: mpiaij >> rows=48, cols=24 >> total: nonzeros=96, allocated nonzeros=480 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> Mat Object: 1 MPI processes >> type: mpiaij >> rows=24, cols=24 >> total: nonzeros=120, allocated nonzeros=120 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: nest >> rows=72, cols=72 >> Matrix object: >> type=nest, rows=2, cols=2 >> MatNest structure: >> (0,0) : prefix="fieldsplit_0_", 
type=mpiaij, rows=48, cols=48 >> (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 >> (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 >> (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24 >> >> >> From: Matthew Knepley >> Sent: Thursday, September 04, 2014 2:20 PM >> To: Klaij, Christiaan >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp >> >> >> >> >> On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan >> wrote: >> I'm playing with the selfp option in fieldsplit using >> snes/examples/tutorials/ex70.c. For example: >> >> mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ >> -ksp_type fgmres \ >> -pc_type fieldsplit \ >> -pc_fieldsplit_type schur \ >> -pc_fieldsplit_schur_fact_type lower \ >> -pc_fieldsplit_schur_precondition selfp \ >> -fieldsplit_1_inner_ksp_type preonly \ >> -fieldsplit_1_inner_pc_type jacobi \ >> -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ >> -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ >> -ksp_monitor -ksp_max_it 1 >> >> gives the following output >> >> 0 KSP Residual norm 1.229687498638e+00 >> Residual norms for fieldsplit_1_ solve. >> 0 KSP Residual norm 2.330138480101e+01 >> 1 KSP Residual norm 1.609000846751e+01 >> 1 KSP Residual norm 1.180287268335e+00 >> >> To my suprise I don't see anything for the fieldsplit_0_ solve, >> why? >> >> >> >> Always run with -ksp_view for any solver question. >> >> >> Thanks, >> >> >> Matt >> Furthermore, if I understand correctly the above should be >> exactly equivalent with >> >> mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ >> -ksp_type fgmres \ >> -pc_type fieldsplit \ >> -pc_fieldsplit_type schur \ >> -pc_fieldsplit_schur_fact_type lower \ >> -user_ksp \ >> -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ >> -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ >> -ksp_monitor -ksp_max_it 1 >> >> 0 KSP Residual norm 1.229687498638e+00 >> Residual norms for fieldsplit_0_ solve. >> 0 KSP Residual norm 5.486639587672e-01 >> 1 KSP Residual norm 6.348354253703e-02 >> Residual norms for fieldsplit_1_ solve. >> 0 KSP Residual norm 2.321938107977e+01 >> 1 KSP Residual norm 1.605484031258e+01 >> 1 KSP Residual norm 1.183225251166e+00 >> >> because -user_ksp replaces the Schur complement by the simple >> approximation A11 - A10 inv(diag(A00)) A01. Beside the missing >> fielsplit_0_ part, the numbers are pretty close but not exactly >> the same. Any explanation? >> >> Chris >> >> >> dr. ir. Christiaan Klaij >> CFD Researcher >> Research & Development >> E mailto:C.Klaij at marin.nl >> T +31 317 49 33 44 >> >> >> MARIN >> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image4a95d2.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagea7316f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From C.Klaij at marin.nl Fri Sep 5 07:31:29 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 5 Sep 2014 12:31:29 +0000 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> <1f81bb0885e94ce59a1f4aa683619cbb@MAR190N1.marin.local>, Message-ID: <62638b7e069743d09cc9f5f7e0a4ece3@MAR190N1.marin.local> Thanks! I've spotted another difference: you are setting the fieldsplit_0_ksp_type and I'm not, just relying on the default instead. If I add -fieldsplit_0_ksp_type gmres then is also get the correct answer. Probably, you will get my problem if you remove -fieldsplit_velocity. mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -pc_fieldsplit_schur_precondition selfp \ -fieldsplit_1_inner_ksp_type preonly \ -fieldsplit_1_inner_pc_type jacobi \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 \ -fieldsplit_0_ksp_type gmres -ksp_view KSP Object: 2 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 2 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 2 MPI processes type: gmres MARIN news: Development of a Scaled-Down Floating Wind Turbine for Offshore Basin Testing This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley Sent: Friday, September 05, 2014 2:10 PM To: Klaij, Christiaan; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Fri, Sep 5, 2014 at 1:34 AM, Klaij, Christiaan > wrote: Matt, I think the problem is somehow related to -pc_fieldsplit_schur_precondition selfp. In the example below your are not using that option. Here is the selfp output. It retains the A00 solver. 
ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition selfp SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by 
preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=2601, allocated nonzeros=2601 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Thanks, Matt Chris [cid:imagea7316f.JPG at f7909bee.44832f74][cid:image4a95d2.JPG at 67db08fd.4e9b7a1c] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Friday, September 05, 2014 12:36 AM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan > wrote: Sorry, here's the ksp_view. I'm expecting -fieldsplit_1_inner_ksp_type preonly to set the ksp(A00) in the Schur complement only, but it seems to set it in the inv(A00) of the diagonal as well. I think something is wrong in your example (we strongly advise against using MatNest directly). 
I cannot reproduce this using SNES ex62: ./config/builder2.py check src/snes/examples/tutorials/ex62.c --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi" which translates to ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi gives Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, 
divergence=10000 left preconditioning has attached null space using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Matt Chris 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_1_ solve. 
0 KSP Residual norm 7.185799114488e+01 1 KSP Residual norm 3.873274154012e+01 1 KSP Residual norm 1.107969383366e+00 KSP Object: 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_0_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=48, cols=48 package used to perform factorization: petsc total: nonzeros=200, allocated nonzeros=200 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=240 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_1_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: preonly maximum iterations=10000, 
initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=24, cols=24 package used to perform factorization: petsc total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=24, cols=24 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 5 nodes, limit used is 5 A10 Mat Object: (a10_) 1 MPI processes type: mpiaij rows=24, cols=48 total: nonzeros=96, allocated nonzeros=96 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP of A00 KSP Object: (fieldsplit_1_inner_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines A01 Mat Object: (a01_) 1 MPI processes type: mpiaij rows=48, cols=24 total: nonzeros=96, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Mat Object: 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=72, cols=72 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48 (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24 From: Matthew Knepley > Sent: Thursday, September 04, 2014 2:20 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan > wrote: I'm playing with the selfp option in fieldsplit using snes/examples/tutorials/ex70.c. 
For example:

mpiexec -n 2 ./ex70 -nx 4 -ny 6 \
  -ksp_type fgmres \
  -pc_type fieldsplit \
  -pc_fieldsplit_type schur \
  -pc_fieldsplit_schur_fact_type lower \
  -pc_fieldsplit_schur_precondition selfp \
  -fieldsplit_1_inner_ksp_type preonly \
  -fieldsplit_1_inner_pc_type jacobi \
  -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \
  -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \
  -ksp_monitor -ksp_max_it 1

gives the following output

  0 KSP Residual norm 1.229687498638e+00
    Residual norms for fieldsplit_1_ solve.
    0 KSP Residual norm 2.330138480101e+01
    1 KSP Residual norm 1.609000846751e+01
  1 KSP Residual norm 1.180287268335e+00

To my surprise I don't see anything for the fieldsplit_0_ solve, why?

Always run with -ksp_view for any solver question.

  Thanks,

     Matt

Furthermore, if I understand correctly the above should be exactly equivalent with

mpiexec -n 2 ./ex70 -nx 4 -ny 6 \
  -ksp_type fgmres \
  -pc_type fieldsplit \
  -pc_fieldsplit_type schur \
  -pc_fieldsplit_schur_fact_type lower \
  -user_ksp \
  -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \
  -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \
  -ksp_monitor -ksp_max_it 1

  0 KSP Residual norm 1.229687498638e+00
    Residual norms for fieldsplit_0_ solve.
    0 KSP Residual norm 5.486639587672e-01
    1 KSP Residual norm 6.348354253703e-02
    Residual norms for fieldsplit_1_ solve.
    0 KSP Residual norm 2.321938107977e+01
    1 KSP Residual norm 1.605484031258e+01
  1 KSP Residual norm 1.183225251166e+00

because -user_ksp replaces the Schur complement by the simple approximation A11 - A10 inv(diag(A00)) A01. Besides the missing fieldsplit_0_ part, the numbers are pretty close but not exactly the same. Any explanation?

Chris

dr. ir. Christiaan Klaij
CFD Researcher
Research & Development
E mailto:C.Klaij at marin.nl
T +31 317 49 33 44

MARIN
2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

From mailinglists at xgm.de Fri Sep 5 08:20:04 2014
From: mailinglists at xgm.de (Florian Lindner)
Date: Fri, 05 Sep 2014 15:20:04 +0200
Subject: [petsc-users] Putting petsc in a namespace
Message-ID:

Hello,

This may be rather a C/C++ question, but ...

I encapsulate some petsc functions into c++ classes.
Since I don't want to pull all petsc symbols into the global namespace for anyone using my classes I try to put petsc into it's own namespace: Header petsc.h: namespace petsc { #include "petscmat.h" } class Vector { petsc::Vec vector; } Implementation petsc.cpp: #include "petsc.h" namespace petsc { #include "petscviewer.h" } using namespace petsc; User: #include "petsc.h" #include // if the user wants he can import parts of petsc of course But this gives a massive amount of error messsages like: mpic++ -o petsc.o -c -O0 -g3 -Wall -I/home/florian/software/petsc/include -I/home/florian/software/petsc/arch-linux2-c-debug/include petsc.cpp mpic++ -o prbf.o -c -O0 -g3 -Wall -I/home/florian/software/petsc/include -I/home/florian/software/petsc/arch-linux2-c-debug/include prbf.cpp In file included from /home/florian/software/petsc/include/petscksp.h:6:0, from prbf.cpp:9: /home/florian/software/petsc/include/petscpc.h:9:14: error: 'PetscErrorCode' does not name a type PETSC_EXTERN PetscErrorCode PCInitializePackage(void); Is there a way to achieve what I want? Thanks, Florian From evanum at gmail.com Fri Sep 5 10:06:30 2014 From: evanum at gmail.com (Evan Um) Date: Fri, 5 Sep 2014 08:06:30 -0700 Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC Message-ID: Dear PETSC users, I tried to use SCOTCH 5.1.12b in my PETSC codes since MUMPS has compatibility issues with the latest SCOTCH library. I was told that SCOTCH/6.0.0 that comes with PETSC/3.5.0 is automatically downloaded and installed. Is it still possible to use old SCOTCH library 5.1.2b in PETSC? As mentioned in MUMPS's FAQ, MUMPS has compatibility issues with the latest SCOTCH. MUMPS developers suggest that MUMPS should work with SCOTCH 5.1.12b. In advance, thanks for your kind comments. Regards, Evan Errors from MUMPS with SCOTCH 6.0.0 and PETSC 3.5.0: (5): ERROR: stratParserParse: invalid method parameter name "type", before "h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}}}},ole=s,ose=s,osq=n{sep=/(vert>120)?m{type=h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}};,ole=f{cmin=15,cmax=100000,frat=0.0},ose=g}} -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 5 10:38:37 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 Sep 2014 10:38:37 -0500 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: <62638b7e069743d09cc9f5f7e0a4ece3@MAR190N1.marin.local> References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> <1f81bb0885e94ce59a1f4aa683619cbb@MAR190N1.marin.local> <62638b7e069743d09cc9f5f7e0a4ece3@MAR190N1.marin.local> Message-ID: On Fri, Sep 5, 2014 at 7:31 AM, Klaij, Christiaan wrote: > Thanks! I've spotted another difference: you are setting the > fieldsplit_0_ksp_type and I'm not, just relying on the default > instead. If I add -fieldsplit_0_ksp_type gmres then is also get > the correct answer. Probably, you will get my problem if you > remove -fieldsplit_velocity. > This is not a bug. The default solver for A00 is preonly, unless it is used as the inner solver as well, in which case it defaults to GMRES so as not to give an inexact Schur complement by default. 
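For reference, the Sp that "-pc_fieldsplit_schur_precondition selfp" assembles is the lumped approximation Sp = A11 - A10 inv(diag(A00)) A01, i.e. the same formula Chris quotes for -user_ksp. Below is a minimal sketch (not the actual PCFIELDSPLIT source) of how such a matrix can be built with public PETSc calls; the function name is made up, the blocks A00, A01, A10, A11 are assumed to be already-assembled Mats, the naming assumes PETSc >= 3.5 (MatCreateVecs), and error checking is omitted.

#include <petscmat.h>

/* Sketch: assemble Sp = A11 - A10 * inv(diag(A00)) * A01, the "selfp"-style
   approximation to the Schur complement S. Illustrative only. */
PetscErrorCode BuildLumpedSchurPmat(Mat A00, Mat A01, Mat A10, Mat A11, Mat *Sp)
{
  Vec d;
  Mat B, A10B;

  MatCreateVecs(A00, &d, NULL);   /* work vector matching A00's diagonal */
  MatGetDiagonal(A00, d);         /* d = diag(A00) */
  VecReciprocal(d);               /* d = 1 / diag(A00), entrywise */

  MatDuplicate(A01, MAT_COPY_VALUES, &B);
  MatDiagonalScale(B, d, NULL);   /* B = inv(diag(A00)) * A01 (row scaling) */

  MatMatMult(A10, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &A10B); /* A10 * B */

  MatDuplicate(A11, MAT_COPY_VALUES, Sp);
  MatAXPY(*Sp, -1.0, A10B, DIFFERENT_NONZERO_PATTERN); /* Sp = A11 - A10*B */

  VecDestroy(&d);
  MatDestroy(&B);
  MatDestroy(&A10B);
  return 0;
}

The assembled Sp is what appears as an ordinary aij preconditioner matrix for the Schur split in the selfp -ksp_view output below (the 145x145 seqaij with 2601 nonzeros), while S itself stays an unassembled schurcomplement matrix.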
ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type gmres -fieldsplit_pressure_inner_ksp_max_it 1 -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition selfp SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=77 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached 
null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=2601, allocated nonzeros=2601 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Matt > mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ > -ksp_type fgmres \ > -pc_type fieldsplit \ > -pc_fieldsplit_type schur \ > -pc_fieldsplit_schur_fact_type lower \ > -pc_fieldsplit_schur_precondition selfp \ > -fieldsplit_1_inner_ksp_type preonly \ > -fieldsplit_1_inner_pc_type jacobi \ > -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ > -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ > -ksp_monitor -ksp_max_it 1 \ > -fieldsplit_0_ksp_type gmres -ksp_view > > > KSP Object: 2 MPI processes > type: fgmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 2 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization LOWER > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (the lumped) A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_0_) 2 MPI processes > type: gmres > > > > > > > MARIN news: Development of a Scaled-Down Floating Wind Turbine for > Offshore Basin Testing > > > This e-mail may be confidential, privileged and/or protected by copyright. > If you are not the intended recipient, you should return it to the sender > immediately and delete your copy from your system. 
> > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Friday, September 05, 2014 2:10 PM > *To:* Klaij, Christiaan; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] fieldsplit_0_ monitor in combination with > selfp > > On Fri, Sep 5, 2014 at 1:34 AM, Klaij, Christiaan > wrote: > >> Matt, >> >> I think the problem is somehow related to >> -pc_fieldsplit_schur_precondition selfp. In the example below your are not >> using that option. >> > > Here is the selfp output. It retains the A00 solver. > > ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet > -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type > fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit > -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres > -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi > -snes_monitor_short -ksp_monitor_short -snes_converged_reason > -ksp_converged_reason -snes_view -show_solution 0 > -fieldsplit_pressure_inner_ksp_type preonly > -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition > selfp > > SNES Object: 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=20 > total number of function evaluations=2 > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: fgmres > GMRES: restart=100, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000 > right preconditioning > has attached null space > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from Sp, an assembled > approximation to S, which uses (the lumped) A00's diagonal's inverse > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5, needed 3.45047 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=962, cols=962 > package used to perform factorization: petsc > total: nonzeros=68692, allocated nonzeros=68692 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 456 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) 1 MPI > processes > type: seqaij > rows=962, cols=962 > total: 
nonzeros=19908, allocated nonzeros=19908 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 481 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_pressure_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-10, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_pressure_) 1 MPI processes > type: jacobi > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: schurcomplement > rows=145, cols=145 > has attached null space > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (fieldsplit_pressure_) > 1 MPI processes > type: seqaij > rows=145, cols=145 > total: nonzeros=945, allocated nonzeros=945 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=145, cols=962 > total: nonzeros=4466, allocated nonzeros=4466 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: (fieldsplit_pressure_inner_) > 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, > divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_pressure_inner_) > 1 MPI processes > type: jacobi > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) > 1 MPI processes > type: seqaij > rows=962, cols=962 > total: nonzeros=19908, allocated nonzeros=19908 > total number of mallocs used during MatSetValues calls > =0 > using I-node routines: found 481 nodes, limit used > is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=962, cols=145 > total: nonzeros=4466, allocated nonzeros=4466 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 481 nodes, limit used is 5 > Mat Object: 1 MPI processes > type: seqaij > rows=145, cols=145 > total: nonzeros=2601, allocated nonzeros=2601 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1107, cols=1107 > total: nonzeros=29785, allocated nonzeros=29785 > total number of mallocs used during MatSetValues calls =0 > has attached null space > using I-node routines: found 513 nodes, limit used is 5 > > Thanks, > > Matt > > >> Chris >> >> dr. ir. Christiaan Klaij >> >> CFD Researcher >> Research & Development >> >> >> >> *MARIN* >> >> >> 2, Haagsteeg E C.Klaij at marin.nl P.O. Box 28 T +31 317 49 39 11 6700 >> AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I >> www.marin.nl >> >> >> >> MARIN news: MARIN at SMM, Hamburg, September 9-12 >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return it to >> the sender immediately and delete your copy from your system. 
>> >> >> >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* Friday, September 05, 2014 12:36 AM >> *To:* Klaij, Christiaan >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] fieldsplit_0_ monitor in combination with >> selfp >> >> On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan >> wrote: >> >>> Sorry, here's the ksp_view. I'm expecting >>> >>> -fieldsplit_1_inner_ksp_type preonly >>> >>> to set the ksp(A00) in the Schur complement only, but it seems to set it >>> in the inv(A00) of the diagonal as well. >>> >> >> I think something is wrong in your example (we strongly advise against >> using MatNest directly). I cannot reproduce this using SNES ex62: >> >> ./config/builder2.py check src/snes/examples/tutorials/ex62.c >> --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly >> -fieldsplit_pressure_inner_pc_type jacobi" >> >> which translates to >> >> ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet >> -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type >> fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit >> -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full >> -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres >> -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi >> -snes_monitor_short -ksp_monitor_short -snes_converged_reason >> -ksp_converged_reason -snes_view -show_solution 0 >> -fieldsplit_pressure_inner_ksp_type preonly >> -fieldsplit_pressure_inner_pc_type jacobi >> >> gives >> >> Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 >> SNES Object: 1 MPI processes >> type: newtonls >> maximum iterations=50, maximum function evaluations=10000 >> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 >> total number of linear solver iterations=20 >> total number of function evaluations=2 >> SNESLineSearch Object: 1 MPI processes >> type: bt >> interpolation: cubic >> alpha=1.000000e-04 >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI processes >> type: fgmres >> GMRES: restart=100, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000 >> right preconditioning >> has attached null space >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from A11 >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_velocity_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_velocity_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5, needed 3.45047 >> Factored matrix follows: >> Mat Object: 1 MPI processes >> 
type: seqaij >> rows=962, cols=962 >> package used to perform factorization: petsc >> total: nonzeros=68692, allocated nonzeros=68692 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 456 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_velocity_) 1 MPI >> processes >> type: seqaij >> rows=962, cols=962 >> total: nonzeros=19908, allocated nonzeros=19908 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 481 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_pressure_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-10, absolute=1e-50, divergence=10000 >> left preconditioning >> has attached null space >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_pressure_) 1 MPI processes >> type: jacobi >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: schurcomplement >> rows=145, cols=145 >> has attached null space >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: (fieldsplit_pressure_) >> 1 MPI processes >> type: seqaij >> rows=145, cols=145 >> total: nonzeros=945, allocated nonzeros=945 >> total number of mallocs used during MatSetValues calls >> =0 >> has attached null space >> not using I-node routines >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=145, cols=962 >> total: nonzeros=4466, allocated nonzeros=4466 >> total number of mallocs used during MatSetValues calls >> =0 >> not using I-node routines >> KSP of A00 >> KSP Object: (fieldsplit_pressure_inner_) >> 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, >> divergence=10000 >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (fieldsplit_pressure_inner_) >> 1 MPI processes >> type: jacobi >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_velocity_) >> 1 MPI processes >> type: seqaij >> rows=962, cols=962 >> total: nonzeros=19908, allocated nonzeros=19908 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 481 nodes, limit used >> is 5 >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=962, cols=145 >> total: nonzeros=4466, allocated nonzeros=4466 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 481 nodes, limit used is >> 5 >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: seqaij >> rows=145, cols=145 >> total: nonzeros=945, allocated nonzeros=945 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1107, cols=1107 >> total: nonzeros=29785, allocated nonzeros=29785 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> using I-node routines: found 513 nodes, limit used is 5 >> >> Matt >> >> >>> Chris >>> >>> 0 KSP Residual norm 1.229687498638e+00 >>> Residual norms for fieldsplit_1_ solve. 
>>> 0 KSP Residual norm 7.185799114488e+01 >>> 1 KSP Residual norm 3.873274154012e+01 >>> 1 KSP Residual norm 1.107969383366e+00 >>> KSP Object: 1 MPI processes >>> type: fgmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> right preconditioning >>> using UNPRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: fieldsplit >>> FieldSplit with Schur preconditioner, factorization LOWER >>> Preconditioner for the Schur complement formed from Sp, an assembled >>> approximation to S, which uses (the lumped) A00's diagonal's inverse >>> Split info: >>> Split number 0 Defined by IS >>> Split number 1 Defined by IS >>> KSP solver for A00 block >>> KSP Object: (fieldsplit_0_) 1 MPI processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_) 1 MPI processes >>> type: bjacobi >>> block Jacobi: number of blocks = 1 >>> Local solve is same for all blocks, in the following KSP and >>> PC objects: >>> KSP Object: (fieldsplit_0_sub_) 1 MPI >>> processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_0_sub_) 1 MPI >>> processes >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero pivot >>> [INBLOCKS] >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=48, cols=48 >>> package used to perform factorization: petsc >>> total: nonzeros=200, allocated nonzeros=200 >>> total number of mallocs used during MatSetValues >>> calls =0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI >>> processes >>> type: seqaij >>> rows=48, cols=48 >>> total: nonzeros=200, allocated nonzeros=240 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=48, cols=48 >>> total: nonzeros=200, allocated nonzeros=480 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> KSP solver for S = A11 - A10 inv(A00) A01 >>> KSP Object: (fieldsplit_1_) 1 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: (fieldsplit_1_) 1 MPI processes >>> type: bjacobi >>> block Jacobi: number of blocks = 1 >>> Local solve is same for all blocks, in the following KSP and >>> PC objects: >>> KSP Object: (fieldsplit_1_sub_) 1 MPI >>> processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (fieldsplit_1_sub_) 1 MPI >>> processes >>> type: bjacobi >>> block Jacobi: number of blocks = 1 >>> Local solve is same for all blocks, in the following KSP >>> and PC objects: >>> KSP Object: >>> (fieldsplit_1_sub_sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: >>> (fieldsplit_1_sub_sub_) 1 MPI processes >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> using diagonal shift on blocks to prevent zero pivot >>> [INBLOCKS] >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=24, cols=24 >>> package used to perform factorization: petsc >>> total: nonzeros=120, allocated nonzeros=120 >>> total number of mallocs used during MatSetValues >>> calls =0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=24, cols=24 >>> total: nonzeros=120, allocated nonzeros=120 >>> total number of mallocs used during MatSetValues calls >>> =0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=24, cols=24 >>> total: nonzeros=120, allocated nonzeros=120 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> linear system matrix followed by preconditioner matrix: >>> Mat Object: (fieldsplit_1_) 1 MPI processes >>> type: schurcomplement >>> rows=24, cols=24 >>> Schur complement A11 - A10 inv(A00) A01 >>> A11 >>> Mat Object: (fieldsplit_1_) 1 >>> MPI processes >>> type: mpiaij >>> rows=24, cols=24 >>> total: nonzeros=0, allocated nonzeros=0 >>> total number of mallocs used during MatSetValues calls =0 >>> using I-node (on process 0) routines: found 5 nodes, >>> limit used is 5 >>> A10 >>> Mat Object: (a10_) 1 MPI >>> processes >>> type: mpiaij >>> rows=24, cols=48 >>> total: nonzeros=96, allocated nonzeros=96 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> KSP of A00 >>> KSP Object: >>> (fieldsplit_1_inner_) 1 MPI processes >>> type: preonly >>> maximum iterations=1, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, >>> divergence=10000 >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: >>> (fieldsplit_1_inner_) 1 MPI processes >>> type: jacobi >>> linear system matrix = precond matrix: >>> Mat Object: >>> (fieldsplit_0_) 1 MPI processes >>> type: mpiaij >>> rows=48, cols=48 >>> total: nonzeros=200, allocated nonzeros=480 >>> total number of mallocs used during MatSetValues calls >>> =0 >>> not using I-node (on process 0) routines >>> A01 >>> Mat Object: (a01_) 1 MPI >>> processes >>> type: mpiaij >>> rows=48, cols=24 >>> total: nonzeros=96, allocated nonzeros=480 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> Mat Object: 1 MPI processes >>> type: mpiaij >>> rows=24, cols=24 >>> total: nonzeros=120, allocated nonzeros=120 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node (on process 0) routines >>> 
linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: nest >>> rows=72, cols=72 >>> Matrix object: >>> type=nest, rows=2, cols=2 >>> MatNest structure: >>> (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48 >>> (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 >>> (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 >>> (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24 >>> >>> >>> From: Matthew Knepley >>> Sent: Thursday, September 04, 2014 2:20 PM >>> To: Klaij, Christiaan >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with >>> selfp >>> >>> >>> >>> >>> On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan >>> wrote: >>> I'm playing with the selfp option in fieldsplit using >>> snes/examples/tutorials/ex70.c. For example: >>> >>> mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ >>> -ksp_type fgmres \ >>> -pc_type fieldsplit \ >>> -pc_fieldsplit_type schur \ >>> -pc_fieldsplit_schur_fact_type lower \ >>> -pc_fieldsplit_schur_precondition selfp \ >>> -fieldsplit_1_inner_ksp_type preonly \ >>> -fieldsplit_1_inner_pc_type jacobi \ >>> -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ >>> -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ >>> -ksp_monitor -ksp_max_it 1 >>> >>> gives the following output >>> >>> 0 KSP Residual norm 1.229687498638e+00 >>> Residual norms for fieldsplit_1_ solve. >>> 0 KSP Residual norm 2.330138480101e+01 >>> 1 KSP Residual norm 1.609000846751e+01 >>> 1 KSP Residual norm 1.180287268335e+00 >>> >>> To my suprise I don't see anything for the fieldsplit_0_ solve, >>> why? >>> >>> >>> >>> Always run with -ksp_view for any solver question. >>> >>> >>> Thanks, >>> >>> >>> Matt >>> Furthermore, if I understand correctly the above should be >>> exactly equivalent with >>> >>> mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ >>> -ksp_type fgmres \ >>> -pc_type fieldsplit \ >>> -pc_fieldsplit_type schur \ >>> -pc_fieldsplit_schur_fact_type lower \ >>> -user_ksp \ >>> -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ >>> -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ >>> -ksp_monitor -ksp_max_it 1 >>> >>> 0 KSP Residual norm 1.229687498638e+00 >>> Residual norms for fieldsplit_0_ solve. >>> 0 KSP Residual norm 5.486639587672e-01 >>> 1 KSP Residual norm 6.348354253703e-02 >>> Residual norms for fieldsplit_1_ solve. >>> 0 KSP Residual norm 2.321938107977e+01 >>> 1 KSP Residual norm 1.605484031258e+01 >>> 1 KSP Residual norm 1.183225251166e+00 >>> >>> because -user_ksp replaces the Schur complement by the simple >>> approximation A11 - A10 inv(diag(A00)) A01. Beside the missing >>> fielsplit_0_ part, the numbers are pretty close but not exactly >>> the same. Any explanation? >>> >>> Chris >>> >>> >>> dr. ir. Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> E mailto:C.Klaij at marin.nl >>> T +31 317 49 33 44 >>> >>> >>> MARIN >>> 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands >>> T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image4a95d2.JPG Type: image/jpeg Size: 1622 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagea7316f.JPG Type: image/jpeg Size: 1069 bytes Desc: not available URL: From jed at jedbrown.org Fri Sep 5 11:00:44 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 10:00:44 -0600 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: References: Message-ID: <87mwaeys43.fsf@jedbrown.org> Florian Lindner writes: > Hello, > > This may be rather a C/C++ question, but ... > > I encapsulate some petsc functions into c++ classes. Since I don't want > to pull all petsc symbols into the global namespace for anyone using my > classes I try to put petsc into it's own namespace: > > Header petsc.h: Note that there is already a petsc.h in the PETSc distribution. It is different from yours. > namespace petsc { > #include "petscmat.h" > } > > class Vector { > petsc::Vec vector; > } > > > Implementation petsc.cpp: > > #include "petsc.h" > > namespace petsc { > #include "petscviewer.h" > } > > using namespace petsc; > > > User: > > #include "petsc.h" > #include // if the user wants he can import parts of petsc > of course > > > But this gives a massive amount of error messsages like: > > mpic++ -o petsc.o -c -O0 -g3 -Wall > -I/home/florian/software/petsc/include > -I/home/florian/software/petsc/arch-linux2-c-debug/include petsc.cpp > mpic++ -o prbf.o -c -O0 -g3 -Wall -I/home/florian/software/petsc/include > -I/home/florian/software/petsc/arch-linux2-c-debug/include prbf.cpp > In file included from > /home/florian/software/petsc/include/petscksp.h:6:0, > from prbf.cpp:9: > /home/florian/software/petsc/include/petscpc.h:9:14: error: > 'PetscErrorCode' does not name a type > PETSC_EXTERN PetscErrorCode PCInitializePackage(void); What do you expect when you put some things in the namespace and include dependencies outside the namespace? namespaces don't play well with macros (including header guards). I don't think what you are attempting is a good use of time, but if you do it, I would recommend creating a public interface that does not include any PETSc headers, thus completely hiding the PETSc interface. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Fri Sep 5 11:11:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 5 Sep 2014 11:11:51 -0500 Subject: [petsc-users] How to run SNES ex62 In-Reply-To: References: Message-ID: On Wed, Aug 13, 2014 at 2:35 PM, Justin Chang wrote: > Hi all, > > This might seem like a silly question, but whenever I try running ./ex62 i > seem to always get a zero solution. 
I didn't see any script for running
> ex62 in the makefile so I have tried all the runtime combination of options
> listed in the Paris tutorial 2012 slides (e.g., block jacobi, gauss seidel,
> uzawa, schur complement, etc). They either give me a zero pivot for LU or
> errors like Petsc has generated inconsistent data. Can someone show me how
> to get solutions for this problem?
>

All my options for ex62 are in the builder.py script in config/. Here is how I run an example:

  ./config/builder2.py check src/snes/examples/tutorials/ex62.c --testnum=0

You can run all tests by omitting --testnum, and see the other options using --help. This requires Python 2.7.

Let me know if anything goes wrong, or you can't get what you want working.

  Thanks,

     Matt

> Thanks,
> Justin
>

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

From bsmith at mcs.anl.gov Fri Sep 5 11:25:16 2014
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 5 Sep 2014 11:25:16 -0500
Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC
In-Reply-To:
References:
Message-ID:

Set a new value for PETSC_ARCH

Obtain the Scotch 5.1.12b tar ball. Use ./configure --download-ptscotch=nameoftarball.tar.gz --download-mumps --download-scalapack etc

It will use the provided tar ball instead of downloading 6.0.0
Send errors to petsc-maint at mcs.anl.gov

  Barry

On Sep 5, 2014, at 10:06 AM, Evan Um wrote:

> Dear PETSC users,
>
> I tried to use SCOTCH 5.1.12b in my PETSC codes since MUMPS has compatibility issues with the latest SCOTCH library.
> I was told that SCOTCH/6.0.0 that comes with PETSC/3.5.0 is automatically downloaded and installed. Is it still possible to use old SCOTCH library 5.1.2b in PETSC? As mentioned in MUMPS's FAQ, MUMPS has compatibility issues with the latest SCOTCH. MUMPS developers suggest that MUMPS should work with SCOTCH 5.1.12b. In advance, thanks for your kind comments.
>
> Regards,
> Evan
>
> Errors from MUMPS with SCOTCH 6.0.0 and PETSC 3.5.0:
> (5): ERROR: stratParserParse: invalid method parameter name "type", before "h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}}}},ole=s,ose=s,osq=n{sep=/(vert>120)?m{type=h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}};,ole=f{cmin=15,cmax=100000,frat=0.0},ose=g}}

From ksong at lbl.gov Fri Sep 5 14:17:50 2014
From: ksong at lbl.gov (Kai Song)
Date: Fri, 5 Sep 2014 14:17:50 -0500
Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC
In-Reply-To:
References:
Message-ID:

Hi Barry,

Thanks for the suggestion. I downloaded the scotch tar ball in the petsc source directory, and set up the configure flag as suggested, but I got the following error:
================
make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'
make: *** No rule to make target `ptesmumps'. Stop.
******************************************************************************* =============================================================================== Trying to download file://scotch_5.1.12b.tar.gz for PTSCOTCH =============================================================================== =============================================================================== Compiling PTScotch; this may take several minutes =============================================================================== ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running make on PTScotch: Could not execute "cd /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src && make clean ptesmumps": /bin/mkdir -p ../bin /bin/mkdir -p ../include /bin/mkdir -p ../lib (cd libscotch ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' rm -f *~ *.o lib*.a parser_yy.c parser_ly.h parser_ll.c *scotch.h *scotchf.h y.output dummysizes make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' (cd scotch ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' rm -f *~ *.o acpl amk_ccc amk_fft2 amk_grf amk_hy amk_m2 amk_p2 atst gbase gcv *ggath *gmap gmk_hy gmk_m2 gmk_m3 gmk_msh gmk_ub2 gmtst *gord gotst gout *gpart *gscat *gtst mcv mmk_m2 mmk_m3 mord mtst make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' (cd libscotchmetis ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis' rm -f *~ *.o lib*.a make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'make: *** No rule to make target `ptesmumps'. Stop. ******************************************************************************* makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. 
makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. makefile:15: conf/petscvariables: No such file or directory make: *** No rule to make target `conf/petscvariables'. Stop. ================ Do you know what am I missing? Thanks, Kai On Fri, Sep 5, 2014 at 11:25 AM, Barry Smith wrote: > > Set a new value for PETSC_ARCH > > Obtain the Scotch 5.1.12b tar ball. Used ./configure > ?download-ptscotch=nameoftarball.tar.gz ?download-mumps ?download-scalapack > etc > > It will use the provided tar ball instead of downloading 6.0.0 Send > errors to petsc-maint at mcs.anl.gov > > Barry > > > > On Sep 5, 2014, at 10:06 AM, Evan Um wrote: > > > Dear PETSC users, > > > > I tried to use SCOTCH 5.1.12b in my PETSC codes since MUMPS has > compatibility issues with the latest SCOTCH library. > > > I was told that SCOTCH/6.0.0 that comes with PETSC/3.5.0 is > automatically downloaded and installed. Is it still possible to use old > SCOTCH library 5.1.2b in PETSC? As mentioned in MUMPS's FAQ, MUMPS has > compatibility issues with the latest SCOTCH. MUMPS developers suggest that > MUMPS should work with SCOTCH 5.1.12b. In advance, thanks for your kind > comments. > > > > Regards, > > Evan > > > > Errors from MUMPS with SCOTCH 6.0.0 and PETSC 3.5.0: > > (5): ERROR: stratParserParse: invalid method parameter name "type", > before > "h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}}}},ole=s,ose=s,osq=n{sep=/(vert>120)?m{type=h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}};,ole=f{cmin=15,cmax=100000,frat=0.0},ose=g}} > > -- Kai Song 1.510.495.2180 1 Cyclotron Rd. Berkeley, CA94720, MS-50B 3209 High Performance Computing Services (HPCS) Lawrence Berkeley National Laboratory - http://scs.lbl.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 5 14:50:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 Sep 2014 14:50:16 -0500 Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC In-Reply-To: References: Message-ID: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> You need the https://gforge.inria.fr/frs/download.php/file/28934/scotch_5.1.12a_esmumps.tar.gz version with mumps in the tar ball name Barry On Sep 5, 2014, at 2:17 PM, Kai Song wrote: > Hi Barry, > > Thanks for the suggestion. I downloaded the scotch tar ball in the petsc source directory, and set up the configure flag as suggested, but I got the following error: > ================ > make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'make: *** No rule to make target `ptesmumps'. Stop. 
> ******************************************************************************* > =============================================================================== > Trying to download file://scotch_5.1.12b.tar.gz for PTSCOTCH =============================================================================== =============================================================================== > Compiling PTScotch; this may take several minutes =============================================================================== ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > Error running make on PTScotch: Could not execute "cd /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src && make clean ptesmumps": > /bin/mkdir -p ../bin > /bin/mkdir -p ../include > /bin/mkdir -p ../lib > (cd libscotch ; make clean) > make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' > rm -f *~ *.o lib*.a parser_yy.c parser_ly.h parser_ll.c *scotch.h *scotchf.h y.output dummysizes > make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' > (cd scotch ; make clean) > make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' > rm -f *~ *.o acpl amk_ccc amk_fft2 amk_grf amk_hy amk_m2 amk_p2 atst gbase gcv *ggath *gmap gmk_hy gmk_m2 gmk_m3 gmk_msh gmk_ub2 gmtst *gord gotst gout *gpart *gscat *gtst mcv mmk_m2 mmk_m3 mord mtst > make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' > (cd libscotchmetis ; make clean) > make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis' > rm -f *~ *.o lib*.a > make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'make: *** No rule to make target `ptesmumps'. Stop. > ******************************************************************************* > > makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory > make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. 
> makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory > make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. > makefile:15: conf/petscvariables: No such file or directory > make: *** No rule to make target `conf/petscvariables'. Stop. > > ================ > > Do you know what am I missing? > > Thanks, > > Kai > > > > > On Fri, Sep 5, 2014 at 11:25 AM, Barry Smith wrote: > > Set a new value for PETSC_ARCH > > Obtain the Scotch 5.1.12b tar ball. Used ./configure ?download-ptscotch=nameoftarball.tar.gz ?download-mumps ?download-scalapack etc > > It will use the provided tar ball instead of downloading 6.0.0 Send errors to petsc-maint at mcs.anl.gov > > Barry > > > > On Sep 5, 2014, at 10:06 AM, Evan Um wrote: > > > Dear PETSC users, > > > > I tried to use SCOTCH 5.1.12b in my PETSC codes since MUMPS has compatibility issues with the latest SCOTCH library. > > > I was told that SCOTCH/6.0.0 that comes with PETSC/3.5.0 is automatically downloaded and installed. Is it still possible to use old SCOTCH library 5.1.2b in PETSC? As mentioned in MUMPS's FAQ, MUMPS has compatibility issues with the latest SCOTCH. MUMPS developers suggest that MUMPS should work with SCOTCH 5.1.12b. In advance, thanks for your kind comments. > > > > Regards, > > Evan > > > > Errors from MUMPS with SCOTCH 6.0.0 and PETSC 3.5.0: > > (5): ERROR: stratParserParse: invalid method parameter name "type", before "h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}}}},ole=s,ose=s,osq=n{sep=/(vert>120)?m{type=h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}};,ole=f{cmin=15,cmax=100000,frat=0.0},ose=g}} > > > > > -- > Kai Song > 1.510.495.2180 > 1 Cyclotron Rd. 
Berkeley, CA94720, MS-50B 3209 > High Performance Computing Services (HPCS) > Lawrence Berkeley National Laboratory - http://scs.lbl.gov From ksong at lbl.gov Fri Sep 5 14:59:02 2014 From: ksong at lbl.gov (Kai Song) Date: Fri, 5 Sep 2014 14:59:02 -0500 Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC In-Reply-To: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> References: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> Message-ID: Hi Barry, I got the similar error for scotch_5.1.12b_esmumps.tar.gz: ============= =============================================================================== Trying to download file://scotch_5.1.12b_esmumps.tar.gz for PTSCOTCH =============================================================================== =============================================================================== Compiling PTScotch; this may take several minutes =============================================================================== ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- Error running make on PTScotch: Could not execute "cd /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src && make clean ptesmumps": /bin/mkdir -p ../bin /bin/mkdir -p ../include /bin/mkdir -p ../lib (cd libscotch ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/libscotch' rm -f *~ *.o lib*.a parser_yy.c parser_ly.h parser_ll.c *scotch.h *scotchf.h y.output dummysizes make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/libscotch' (cd scotch ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/scotch' rm -f *~ *.o acpl amk_ccc amk_fft2 amk_grf amk_hy amk_m2 amk_p2 atst gbase gcv *ggath *gmap gmk_hy gmk_m2 gmk_m3 gmk_msh gmk_ub2 gmtst *gord gotst gout *gpart *gscat *gtst mcv mmk_m2 mmk_m3 mord mtst make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/scotch' (cd libscotchmetis ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/libscotchmetis' rm -f *~ *.o lib*.a make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/libscotchmetis' (cd esmumps ; make clean) make[1]: Entering directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/esmumps' rm -f *~ common.h *.o lib*.a main_esmumps make[1]: Leaving directory `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12_esmumps/src/esmumps'make: *** No rule to make target `ptesmumps'. Stop. 
******************************************************************************* makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: No such file or directory /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: No such file or directory make: *** No rule to make target `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. Stop. makefile:15: conf/petscvariables: No such file or directory make: *** No rule to make target `conf/petscvariables'. Stop. ============= Feel free to let me know if you need any additional information. My configure line looks like this: ./configure --prefix=/clusterfs/voltaire/home/software/modules/petsc/3.5.0 --download-fblaslapack=1 --download-mumps=1 --download-parmetis=parmetis-4.0.3.tar.gz --download-ptscotch=scotch_5.1.12b_esmumps.tar.gz --download-scalapack --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-hypre=1 --with-mpi-dir=/global/software/sl-6.x86_64/modules/gcc/4.4.7/openmpi/1.6.5-gcc/ Thanks, Kai On Fri, Sep 5, 2014 at 2:50 PM, Barry Smith wrote: > > You need the > https://gforge.inria.fr/frs/download.php/file/28934/scotch_5.1.12a_esmumps.tar.gz > version with mumps in the tar ball name > > Barry > > On Sep 5, 2014, at 2:17 PM, Kai Song wrote: > > > Hi Barry, > > > > Thanks for the suggestion. I downloaded the scotch tar ball in the petsc > source directory, and set up the configure flag as suggested, but I got the > following error: > > ================ > > make[1]: Leaving directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'make: > *** No rule to make target `ptesmumps'. Stop. 
> > > ******************************************************************************* > > > =============================================================================== > > Trying to download file://scotch_5.1.12b.tar.gz for PTSCOTCH > > =============================================================================== > > =============================================================================== > > Compiling PTScotch; this may take several minutes > > =============================================================================== > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for details): > > > ------------------------------------------------------------------------------- > > Error running make on PTScotch: Could not execute "cd > /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src > && make clean ptesmumps": > > /bin/mkdir -p ../bin > > /bin/mkdir -p ../include > > /bin/mkdir -p ../lib > > (cd libscotch ; make clean) > > make[1]: Entering directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' > > rm -f *~ *.o lib*.a parser_yy.c parser_ly.h parser_ll.c *scotch.h > *scotchf.h y.output dummysizes > > make[1]: Leaving directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotch' > > (cd scotch ; make clean) > > make[1]: Entering directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' > > rm -f *~ *.o acpl amk_ccc amk_fft2 amk_grf amk_hy amk_m2 amk_p2 atst > gbase gcv *ggath *gmap gmk_hy gmk_m2 gmk_m3 gmk_msh gmk_ub2 gmtst *gord > gotst gout *gpart *gscat *gtst mcv mmk_m2 mmk_m3 mord mtst > > make[1]: Leaving directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/scotch' > > (cd libscotchmetis ; make clean) > > make[1]: Entering directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis' > > rm -f *~ *.o lib*.a > > make[1]: Leaving directory > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/externalpackages/scotch_5.1.12/src/libscotchmetis'make: > *** No rule to make target `ptesmumps'. Stop. > > > ******************************************************************************* > > > > makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or > directory > > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: > /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: > No such file or directory > > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: > /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: > No such file or directory > > make: *** No rule to make target > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. > Stop. 
> > makefile:15: arch-linux2-c-debug/conf/petscvariables: No such file or > directory > > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/variables:117: > /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscvariables: > No such file or directory > > /clusterfs/voltaire/home/software/source/petsc-3.5.0/conf/rules:993: > /clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules: > No such file or directory > > make: *** No rule to make target > `/clusterfs/voltaire/home/software/source/petsc-3.5.0/arch-linux2-c-debug/conf/petscrules'. > Stop. > > makefile:15: conf/petscvariables: No such file or directory > > make: *** No rule to make target `conf/petscvariables'. Stop. > > > > ================ > > > > Do you know what am I missing? > > > > Thanks, > > > > Kai > > > > > > > > > > On Fri, Sep 5, 2014 at 11:25 AM, Barry Smith wrote: > > > > Set a new value for PETSC_ARCH > > > > Obtain the Scotch 5.1.12b tar ball. Used ./configure > ?download-ptscotch=nameoftarball.tar.gz ?download-mumps ?download-scalapack > etc > > > > It will use the provided tar ball instead of downloading 6.0.0 Send > errors to petsc-maint at mcs.anl.gov > > > > Barry > > > > > > > > On Sep 5, 2014, at 10:06 AM, Evan Um wrote: > > > > > Dear PETSC users, > > > > > > I tried to use SCOTCH 5.1.12b in my PETSC codes since MUMPS has > compatibility issues with the latest SCOTCH library. > > > > > I was told that SCOTCH/6.0.0 that comes with PETSC/3.5.0 is > automatically downloaded and installed. Is it still possible to use old > SCOTCH library 5.1.2b in PETSC? As mentioned in MUMPS's FAQ, MUMPS has > compatibility issues with the latest SCOTCH. MUMPS developers suggest that > MUMPS should work with SCOTCH 5.1.12b. In advance, thanks for your kind > comments. > > > > > > Regards, > > > Evan > > > > > > Errors from MUMPS with SCOTCH 6.0.0 and PETSC 3.5.0: > > > (5): ERROR: stratParserParse: invalid method parameter name "type", > before > "h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}}}},ole=s,ose=s,osq=n{sep=/(vert>120)?m{type=h,vert=100,low=h{pass=10},asc=b{width=3,bnd=f{bal=0.2},org=h{pass=10}f{bal=0.2}}};,ole=f{cmin=15,cmax=100000,frat=0.0},ose=g}} > > > > > > > > > > -- > > Kai Song > > 1.510.495.2180 > > 1 Cyclotron Rd. Berkeley, CA94720, MS-50B 3209 > > High Performance Computing Services (HPCS) > > Lawrence Berkeley National Laboratory - http://scs.lbl.gov > > -- Kai Song 1.510.495.2180 1 Cyclotron Rd. Berkeley, CA94720, MS-50B 3209 High Performance Computing Services (HPCS) Lawrence Berkeley National Laboratory - http://scs.lbl.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 5 14:59:31 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 13:59:31 -0600 Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC In-Reply-To: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> References: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> Message-ID: <8761h1zvmk.fsf@jedbrown.org> Barry Smith writes: > You need the https://gforge.inria.fr/frs/download.php/file/28934/scotch_5.1.12a_esmumps.tar.gz version with mumps in the tar ball name And yes, this is absurd, so don't forget to complain to upstream. One can only hope that overwhelming user outcry may be sufficient for them to reconsider this distribution decision. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Fri Sep 5 15:06:04 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 14:06:04 -0600 Subject: [petsc-users] Using MUMPS and (PT)SCOTCH with PETSC In-Reply-To: References: <45D3C7B4-D972-4EF6-96D0-BBFE00DA45F7@mcs.anl.gov> Message-ID: <8738c5zvbn.fsf@jedbrown.org> Kai Song writes: > Hi Barry, > > I got the similar error for scotch_5.1.12b_esmumps.tar.gz: Looks like this commit was only valid with 6.0, so you'll have to revert it to use 5.1.12b. https://bitbucket.org/petsc/petsc/commits/4620623451f619bc2d23a7c3de5bdf5ee5fd0ff2 Is there a reason you prefer Scotch over ParMETIS? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From alpkalpalp at gmail.com Fri Sep 5 17:48:14 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Sat, 6 Sep 2014 01:48:14 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: Hi, Sorry for the late response. I tried the same sequence. But I need to say that I use git pull before it. Error is now during make all you may find the logs in the attachment. thanks beforehand regards, On Fri, Sep 5, 2014 at 2:51 PM, Matthew Knepley wrote: > Please send your configure.log and make.log > > Thanks, > > Matt > > > On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp wrote: > >> As I said before, I have checked out "master" branch and merged with your >> stefano_zampini/pcbddc-primalfixe branch. configured, compiled successfully. >> I have used --with-pcbddc option in configure as Barry suggested. 
>> >> However, tests are failed with following reason: >> >> akalpalp at a-kalpalp ~/petsc >> $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug test >> Running test examples to verify correct installation >> Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug >> *******************Error detected during compile or >> link!******************* >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> /home/akalpalp/petsc/src/snes/examples/tutorials ex19 >> >> ********************************************************************************* >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 >> -I/home/akalpalp/petsc/include >> -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall -Wwrite-strings >> -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o >> -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc >> -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack >> -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl >> /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): >> undefined reference to `PCCreate_BDDC' >> collect2: error: ld returned 1 exit status >> makefile:108: recipe for target 'ex19' failed >> make[3]: [ex19] Error 1 (ignored) >> /usr/bin/rm -f ex19.o >> Completed test examples >> ========================================= >> Now to evaluate the computer systems you plan use - do: >> make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug streams >> NPMAX= >> >> >> >> On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith wrote: >> >>> >>> This is likely due to the horrible horrible fact that some of the >>> bddc files only get compiled if ./configure is run with the option >>> --with-pcbddc you will need to rerun ./configure and then make with that >>> option. >>> >>> I pray that someone removes that horrible confusing configure option. >>> >>> Barry >>> >>> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp wrote: >>> >>> > Dear Stefano, >>> > >>> > I have checked out "master" branch and merged with your >>> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all >>> tests are completed successfully. >>> > Then, I tried to compile ex59 with make ex59, it results in unresolved >>> external error. I believe your bddc_feti files are not included in >>> compilation. >>> > Since I am not experienced on how to solve issues in Petsc, I need to >>> ask several questions; >>> > >>> > 1-) Are there any global settings to add additonal directories to >>> compilation (src\ksp\pc\impls\bddc) >>> > 2-) or should I include these files on top of ex59 (AFAIK, including >>> .c files is not a good thing) >>> > 3-) and finally what is the better way of helping you (creating >>> another branch from yours or what) >>> > >>> > Thanks in advance >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < >>> stefano.zampini at gmail.com> wrote: >>> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is >>> the dual of the other) and it does not have its own classes so far. >>> > >>> > That said, you can experiment with FETI-DP only after having setup a >>> BDDC preconditioner with the options and customization you prefer. >>> > Use >>> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html >>> for manual pages. 
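The recipe quoted above and continued just below (set up PCBDDC on a MATIS operator, obtain the matrix-free FETI-DP matrix F as ex59's ComputeKSPFETIDP does, then translate right-hand sides and solutions with PCBDDCMatFETIDPGetRHS/PCBDDCMatFETIDPGetSolution) corresponds roughly to the sketch below. It is only a sketch against the 3.5-era API and makes a few assumptions: PETSc was configured with --with-pcbddc, the global operator A is already assembled as a MATIS (via MatSetValuesLocal, so existing assembly code stays largely unchanged), and PCBDDCCreateFETIDPOperators still has the three-argument form ex59 used at that time; SolveWithFETIDP itself is a hypothetical helper, so check the manual pages of your release rather than treating this as a drop-in implementation.

#include <petscksp.h>

/* Sketch: solve A x = b through the FETI-DP operator built from a BDDC
   preconditioner.  A is assumed to be a MATIS assembled with
   MatSetValuesLocal(); signatures follow the 3.5-era sources. */
PetscErrorCode SolveWithFETIDP(Mat A, Vec b, Vec x)
{
  KSP            ksp_bddc, ksp_fetidp;
  PC             pc_bddc, pc_fetidp;
  Mat            F;             /* matrix-free FETI-DP operator */
  Vec            F_rhs, F_sol;  /* rhs/solution in the multiplier space */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* 1. Set up a BDDC preconditioner on the MATIS operator; FETI-DP is
        built as its dual, so all BDDC customization happens here. */
  ierr = KSPCreate(PetscObjectComm((PetscObject)A),&ksp_bddc);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp_bddc,A,A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp_bddc,&pc_bddc);CHKERRQ(ierr);
  ierr = PCSetType(pc_bddc,PCBDDC);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp_bddc);CHKERRQ(ierr);
  ierr = KSPSetUp(ksp_bddc);CHKERRQ(ierr);

  /* 2. Obtain the matrix-free FETI-DP operator F and its Dirichlet
        preconditioner from the set-up BDDC object (ex59 wraps this
        step in ComputeKSPFETIDP; three-argument form assumed). */
  ierr = PCBDDCCreateFETIDPOperators(pc_bddc,&F,&pc_fetidp);CHKERRQ(ierr);

  /* 3. Map the physical rhs b into the multiplier space, solve the
        FETI-DP system, and map the multipliers back to a physical x. */
  ierr = MatGetVecs(F,&F_sol,&F_rhs);CHKERRQ(ierr); /* MatCreateVecs() in later releases */
  ierr = PCBDDCMatFETIDPGetRHS(F,b,F_rhs);CHKERRQ(ierr);

  ierr = KSPCreate(PetscObjectComm((PetscObject)F),&ksp_fetidp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp_fetidp,F,F);CHKERRQ(ierr);
  ierr = KSPSetType(ksp_fetidp,KSPCG);CHKERRQ(ierr);
  ierr = KSPSetPC(ksp_fetidp,pc_fetidp);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp_fetidp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp_fetidp,F_rhs,F_sol);CHKERRQ(ierr);

  ierr = PCBDDCMatFETIDPGetSolution(F,F_sol,x);CHKERRQ(ierr);

  /* clean up */
  ierr = VecDestroy(&F_rhs);CHKERRQ(ierr);
  ierr = VecDestroy(&F_sol);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  ierr = PCDestroy(&pc_fetidp);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp_fetidp);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp_bddc);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Because the FETI-DP operator is built from the BDDC object, whatever customization is applied to the first KSP's preconditioner is what shapes F and its Dirichlet preconditioner; there is no separate FETI-DP class to configure, exactly as Stefano notes in the text that follows.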
>>> > >>> > For an 'how to' with FETIDP, please see >>> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look >>> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented >>> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have >>> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to >>> obtain a right-hand side for the FETIDP system and a physical solution from >>> the solution of the FETIDP system. >>> > >>> > I would recommend you to use the development version of the library >>> and either use the ?next? branch or the ?master' branch after having merged >>> in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also >>> contains the new deluxe scaling operator for BDDC which is not available to >>> use with FETI-DP. >>> > >>> > If you have any other questions which can be useful for other PETSc >>> users, please use the mailing list; otherwise you can contact me personally. >>> > >>> > Stefano >>> > >>> > >>> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: >>> > >>> >> Matthew Knepley writes: >>> >>>> 1- Is it possible to complete a FETI-DP solution with the provided >>> >>>> functions in current PetSc release? >>> >>>> >>> >>> >>> >>> There is no FETI-DP in PETSc. >>> >> >>> >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. >>> You >>> >> can enable it by configuring --with-pcbddc. This will be turned on by >>> >> default soon. It is fairly new, so you should use the branch 'master' >>> >> instead of the release. It has an option to do FETI-DP instead of >>> BDDC. >>> >> See src/ksp/ksp/examples/tutorials/ex59.c. >>> >> >>> >> For either of these methods, you have to assemble a MATIS. If you use >>> >> MatSetValuesLocal, most of your assembly code can stay the same. >>> >> >>> >> Hopefully we can get better examples before the next release. Stefano >>> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but >>> we >>> >> really need more complete tests within PETSc. >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2499120 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 149282 bytes Desc: not available URL: From john.m.alletto at lmco.com Fri Sep 5 17:54:00 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Fri, 5 Sep 2014 22:54:00 +0000 Subject: [petsc-users] FW: LaPlacian 3D example 34.c Message-ID: All, LaPlacian 3D example 34.c - how do you invoke multiGrid ? In general I am also having to try by trial and error which solvers and preconditioners support 3D objects. Is there a list somewhere? Respectfullt John Alletto -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Fri Sep 5 18:03:44 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 17:03:44 -0600 Subject: [petsc-users] FW: LaPlacian 3D example 34.c In-Reply-To: References: Message-ID: <87r3zpy8j3.fsf@jedbrown.org> "Alletto, John M" writes: > All, > > LaPlacian 3D example 34.c - how do you invoke multiGrid ? 
Did you read my comment on your scicomp post? I recommend using src/ksp/ksp/examples/tutorials/ex45.c since it was written by us (Barry) and is more controllable via run-time options. ./ex45 -da_refine 3 -pc_type mg > In general I am also having to try by trial and error which solvers and preconditioners support 3D objects. What do you mean? What have you tried? Exact command lines and error messages would be useful, otherwise I have to gaze into my crystal ball to guess what issues you might be running into. 3D is no different from other dimensions at the algebraic level that most of PETSc operates at. > Is there a list somewhere? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Fri Sep 5 18:44:48 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 Sep 2014 18:44:48 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: perhaps unrelated - but are you sure you had the latest master? >>>> PETSC_VERSION_RELEASE 0 PETSC_VERSION_MAJOR 3 PETSC_VERSION_MINOR 5 PETSC_VERSION_SUBMINOR 0 PETSC_VERSION_PATCH 0 <<<<<<< PETSC_VERSION_PATCH should be 1 >>>>> CLINKER /home/alp/petsc/arch-mswin-c-debug/lib/libpetsc.so.3.05.0 Warning: corrupt .drectve at end of def file Warning: corrupt .drectve at end of def file Warning: corrupt .drectve at end of def file rch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): undefined reference to `__security_cookie' arch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__security_cookie' <<<<<< Hm - must be related to --with-shared-libraries option you are specifying. This is untested on windows. Can you remove it and retry? Satish On Fri, 5 Sep 2014, Alp Kalpalp wrote: > Hi, > > Sorry for the late response. I tried the same sequence. But I need to say > that I use git pull before it. Error is now during make all > > you may find the logs in the attachment. > > thanks beforehand > > regards, > > > On Fri, Sep 5, 2014 at 2:51 PM, Matthew Knepley wrote: > > > Please send your configure.log and make.log > > > > Thanks, > > > > Matt > > > > > > On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp wrote: > > > >> As I said before, I have checked out "master" branch and merged with your > >> stefano_zampini/pcbddc-primalfixe branch. configured, compiled successfully. > >> I have used --with-pcbddc option in configure as Barry suggested. 
> >> > >> However, tests are failed with following reason: > >> > >> akalpalp at a-kalpalp ~/petsc > >> $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug test > >> Running test examples to verify correct installation > >> Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug > >> *******************Error detected during compile or > >> link!******************* > >> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >> /home/akalpalp/petsc/src/snes/examples/tutorials ex19 > >> > >> ********************************************************************************* > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall > >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > >> -I/home/akalpalp/petsc/include > >> -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall -Wwrite-strings > >> -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o > >> -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc > >> -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack > >> -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl > >> /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): > >> undefined reference to `PCCreate_BDDC' > >> collect2: error: ld returned 1 exit status > >> makefile:108: recipe for target 'ex19' failed > >> make[3]: [ex19] Error 1 (ignored) > >> /usr/bin/rm -f ex19.o > >> Completed test examples > >> ========================================= > >> Now to evaluate the computer systems you plan use - do: > >> make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug streams > >> NPMAX= > >> > >> > >> > >> On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith wrote: > >> > >>> > >>> This is likely due to the horrible horrible fact that some of the > >>> bddc files only get compiled if ./configure is run with the option > >>> --with-pcbddc you will need to rerun ./configure and then make with that > >>> option. > >>> > >>> I pray that someone removes that horrible confusing configure option. > >>> > >>> Barry > >>> > >>> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp wrote: > >>> > >>> > Dear Stefano, > >>> > > >>> > I have checked out "master" branch and merged with your > >>> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and all > >>> tests are completed successfully. > >>> > Then, I tried to compile ex59 with make ex59, it results in unresolved > >>> external error. I believe your bddc_feti files are not included in > >>> compilation. > >>> > Since I am not experienced on how to solve issues in Petsc, I need to > >>> ask several questions; > >>> > > >>> > 1-) Are there any global settings to add additonal directories to > >>> compilation (src\ksp\pc\impls\bddc) > >>> > 2-) or should I include these files on top of ex59 (AFAIK, including > >>> .c files is not a good thing) > >>> > 3-) and finally what is the better way of helping you (creating > >>> another branch from yours or what) > >>> > > >>> > Thanks in advance > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > >>> stefano.zampini at gmail.com> wrote: > >>> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is > >>> the dual of the other) and it does not have its own classes so far. 
> >>> > > >>> > That said, you can experiment with FETI-DP only after having setup a > >>> BDDC preconditioner with the options and customization you prefer. > >>> > Use > >>> http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > >>> for manual pages. > >>> > > >>> > For an 'how to' with FETIDP, please see > >>> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look > >>> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > >>> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have > >>> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to > >>> obtain a right-hand side for the FETIDP system and a physical solution from > >>> the solution of the FETIDP system. > >>> > > >>> > I would recommend you to use the development version of the library > >>> and either use the ?next? branch or the ?master' branch after having merged > >>> in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also > >>> contains the new deluxe scaling operator for BDDC which is not available to > >>> use with FETI-DP. > >>> > > >>> > If you have any other questions which can be useful for other PETSc > >>> users, please use the mailing list; otherwise you can contact me personally. > >>> > > >>> > Stefano > >>> > > >>> > > >>> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > >>> > > >>> >> Matthew Knepley writes: > >>> >>>> 1- Is it possible to complete a FETI-DP solution with the provided > >>> >>>> functions in current PetSc release? > >>> >>>> > >>> >>> > >>> >>> There is no FETI-DP in PETSc. > >>> >> > >>> >> Wrong. There is PCBDDC, which has the same eigenvalues as FETI-DP. > >>> You > >>> >> can enable it by configuring --with-pcbddc. This will be turned on by > >>> >> default soon. It is fairly new, so you should use the branch 'master' > >>> >> instead of the release. It has an option to do FETI-DP instead of > >>> BDDC. > >>> >> See src/ksp/ksp/examples/tutorials/ex59.c. > >>> >> > >>> >> For either of these methods, you have to assemble a MATIS. If you use > >>> >> MatSetValuesLocal, most of your assembly code can stay the same. > >>> >> > >>> >> Hopefully we can get better examples before the next release. Stefano > >>> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, but > >>> we > >>> >> really need more complete tests within PETSc. > >>> > > >>> > > >>> > >>> > >> > > > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which their > > experiments lead. > > -- Norbert Wiener > > > From balay at mcs.anl.gov Fri Sep 5 19:08:27 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 Sep 2014 19:08:27 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: On Wed, 3 Sep 2014, Stefano Zampini wrote: > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the dual of the other) and it does not have its own classes so far. > > That said, you can experiment with FETI-DP only after having setup a BDDC preconditioner with the options and customization you prefer. > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for manual pages. 
> > For an 'how to' with FETIDP, please see src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to obtain a right-hand side for the FETIDP system and a physical solution from the solution of the FETIDP system. > > I would recommend you to use the development version of the library and either use the ?next? branch or the ?master' branch after having merged in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also contains the new deluxe scaling operator for BDDC which is not available to use with FETI-DP. > > If you have any other questions which can be useful for other PETSc users, please use the mailing list; otherwise you can contact me personally. > > Stefano Hm - this example is crashing for me.. [both with next and master+stefano_zampini/pcbddc-primalfixes] Needs some debugging.. Satish >>>>>>>>>>> balay at asterix /home/balay/petsc/src/ksp/ksp/examples/tutorials (test) $ make runex59 12a13,96 > [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Null argument, when expecting valid pointer > [2]PETSC ERROR: Null Object: Parameter # 1 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [2]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > [2]PETSC ERROR: [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [3]PETSC ERROR: Null argument, when expecting valid pointer > [3]PETSC ERROR: Null Object: Parameter # 1 > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [3]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > [3]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > [3]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > [2]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > [2]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > [3]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > [3]PETSC ERROR: [2]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > [2]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > [3]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > [2]PETSC ERROR: [3]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > [2]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [3]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > [3]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > [3]PETSC ERROR: [2]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > [2]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > [3]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [2]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > [3]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > [3]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > [3]PETSC ERROR: [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 2 > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 3 > --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Object: Parameter # 1 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > [1]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > [1]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > [1]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > [1]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > [1]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > [1]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > [1]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > [1]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > [1]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [1]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 1 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > [0]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > [0]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > [0]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > [0]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > [0]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > [0]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > [0]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > [0]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > [0]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = PID 14663 RUNNING AT asterix.mcs.anl.gov > = EXIT CODE: 85 > = CLEANING UP REMAINING PROCESSES > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > =================================================================================== /home/balay/petsc/src/ksp/ksp/examples/tutorials Possible problem with ex59, diffs above ========================================= balay at asterix /home/balay/petsc/src/ksp/ksp/examples/tutorials (test) $ From balay at mcs.anl.gov Fri Sep 5 19:39:15 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 Sep 2014 19:39:15 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: Ok - looks like this error check in ISLocalToGlobalMappingRestoreBlockInfo() was added a couple of days back. >>> https://bitbucket.org/petsc/petsc/commits/cbc1caf078fb2bf42b82e0b5ac811b1101900405 PetscValidHeaderSpecific(mapping,IS_LTOGM_CLASSID,1); <<< This is breaking PCISDestroy() - which is attempting to pass in a null for 'mapping' >>>>>> if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); } <<<<<< Commenting out the error check gets the code working. Satish On Fri, 5 Sep 2014, Satish Balay wrote: > On Wed, 3 Sep 2014, Stefano Zampini wrote: > > > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one is the dual of the other) and it does not have its own classes so far. > > > > That said, you can experiment with FETI-DP only after having setup a BDDC preconditioner with the options and customization you prefer. > > Use http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html for manual pages. 
> > > > For an 'how to' with FETIDP, please see src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically look at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once you have F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution to obtain a right-hand side for the FETIDP system and a physical solution from the solution of the FETIDP system. > > > > I would recommend you to use the development version of the library and either use the ?next? branch or the ?master' branch after having merged in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? also contains the new deluxe scaling operator for BDDC which is not available to use with FETI-DP. > > > > If you have any other questions which can be useful for other PETSc users, please use the mailing list; otherwise you can contact me personally. > > > > Stefano > > Hm - this example is crashing for me.. [both with next and master+stefano_zampini/pcbddc-primalfixes] > > Needs some debugging.. > > Satish > > >>>>>>>>>>> > balay at asterix /home/balay/petsc/src/ksp/ksp/examples/tutorials (test) > $ make runex59 > 12a13,96 > > [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > [2]PETSC ERROR: Null Object: Parameter # 1 > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [2]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > > [2]PETSC ERROR: [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [3]PETSC ERROR: Null argument, when expecting valid pointer > > [3]PETSC ERROR: Null Object: Parameter # 1 > > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [3]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > > [3]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > > [3]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > > ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > > [2]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > > [2]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [3]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [3]PETSC ERROR: [2]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [2]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [3]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > > #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > > [2]PETSC ERROR: [3]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > > #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > > [2]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [3]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [2]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > > [3]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > > [3]PETSC ERROR: [2]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > > [2]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > > [3]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [2]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > > [3]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > > [2]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > > [3]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > > [3]PETSC ERROR: [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 2 > > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 3 > > --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > [1]PETSC ERROR: Null Object: Parameter # 1 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [1]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > > [1]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > > [1]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > > [1]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [1]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [1]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > > [1]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > > [1]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > > [1]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > > [1]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [1]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > > [1]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > > [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 1 > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: Null Object: Parameter # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Development GIT revision: v3.5.1-212-g160c54e GIT Date: 2014-09-05 18:38:50 -0500 > > [0]PETSC ERROR: ./ex59 on a arch-pcbddc named asterix.mcs.anl.gov by balay Fri Sep 5 19:04:27 2014 > > [0]PETSC ERROR: Configure options --with-pcbddc=1 PETSC_ARCH=arch-pcbddc > > [0]PETSC ERROR: #1 ISLocalToGlobalMappingRestoreBlockInfo() line 1146 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [0]PETSC ERROR: #2 ISLocalToGlobalMappingRestoreInfo() line 1245 in /home/balay/petsc/src/vec/is/utils/isltog.c > > [0]PETSC ERROR: #3 PCISDestroy() line 381 in /home/balay/petsc/src/ksp/pc/impls/is/pcis.c > > [0]PETSC ERROR: #4 PCDestroy_BDDC() line 1373 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddc.c > > [0]PETSC ERROR: #5 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #6 PCBDDCDestroyFETIDPPC() line 77 in /home/balay/petsc/src/ksp/pc/impls/bddc/bddcfetidp.c > > [0]PETSC ERROR: #7 PCDestroy_Shell() line 194 in /home/balay/petsc/src/ksp/pc/impls/shell/shellpc.c > > [0]PETSC ERROR: #8 PCDestroy() line 121 in /home/balay/petsc/src/ksp/pc/interface/precon.c > > [0]PETSC ERROR: #9 KSPDestroy() line 808 in /home/balay/petsc/src/ksp/ksp/interface/itfunc.c > > [0]PETSC ERROR: #10 main() line 1088 in /home/balay/petsc/src/ksp/ksp/examples/tutorials/ex59.c > > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 > > > > =================================================================================== > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > = PID 14663 RUNNING AT asterix.mcs.anl.gov > > = EXIT CODE: 85 > > = CLEANING UP REMAINING PROCESSES > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > =================================================================================== > /home/balay/petsc/src/ksp/ksp/examples/tutorials > Possible problem with ex59, diffs above > ========================================= > balay at asterix /home/balay/petsc/src/ksp/ksp/examples/tutorials (test) > $ From jed at jedbrown.org Fri Sep 5 19:53:06 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 18:53:06 -0600 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> Message-ID: <87egvpy3gt.fsf@jedbrown.org> Satish Balay writes: > Ok - looks like this error check in > ISLocalToGlobalMappingRestoreBlockInfo() was added a couple of days > back. > >>>> > https://bitbucket.org/petsc/petsc/commits/cbc1caf078fb2bf42b82e0b5ac811b1101900405 > PetscValidHeaderSpecific(mapping,IS_LTOGM_CLASSID,1); > <<< > > This is breaking PCISDestroy() - which is attempting to pass in a null for 'mapping' > >>>>>>> > if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { > ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); Yuck. > } > <<<<<< > > Commenting out the error check gets the code working. I consider the error check to be correct; the code using it needs to be fixed. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Fri Sep 5 22:22:43 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 5 Sep 2014 22:22:43 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <87egvpy3gt.fsf@jedbrown.org> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <87egvpy3gt.fsf@jedbrown.org> Message-ID: On Fri, 5 Sep 2014, Jed Brown wrote: > Satish Balay writes: > > > Ok - looks like this error check in > > ISLocalToGlobalMappingRestoreBlockInfo() was added a couple of days > > back. > > > >>>> > > https://bitbucket.org/petsc/petsc/commits/cbc1caf078fb2bf42b82e0b5ac811b1101900405 > > PetscValidHeaderSpecific(mapping,IS_LTOGM_CLASSID,1); > > <<< > > > > This is breaking PCISDestroy() - which is attempting to pass in a null for 'mapping' > > > >>>>>>> > > if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { > > ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > > Yuck. > > > } > > <<<<<< > > > > Commenting out the error check gets the code working. > > I consider the error check to be correct; the code using it needs to be fixed. Perhaps the following is the fix [with proper comments, more error checks?]. But someone more familiar with this code should check this.. Satish -------------- $ git diff |cat diff --git a/src/ksp/pc/impls/is/pcis.c b/src/ksp/pc/impls/is/pcis.c index dab5836..0fa0217 100644 --- a/src/ksp/pc/impls/is/pcis.c +++ b/src/ksp/pc/impls/is/pcis.c @@ -140,6 +140,8 @@ PetscErrorCode PCISSetUp(PC pc) ierr = PetscObjectTypeCompare((PetscObject)pc->pmat,MATIS,&flg);CHKERRQ(ierr); if (!flg) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONG,"Preconditioner type of Neumann Neumman requires matrix of type MATIS"); matis = (Mat_IS*)pc->pmat->data; + PetscObjectReference((PetscObject)pc->pmat); + pcis->pmat = pc->pmat; pcis->pure_neumann = matis->pure_neumann; @@ -378,8 +380,9 @@ PetscErrorCode PCISDestroy(PC pc) ierr = VecScatterDestroy(&pcis->global_to_B);CHKERRQ(ierr); ierr = PetscFree(pcis->work_N);CHKERRQ(ierr); if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { - ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); + ierr = ISLocalToGlobalMappingRestoreInfo(((Mat_IS*)pcis->pmat->data)->mapping,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); } + ierr = MatDestroy(&pcis->pmat);CHKERRQ(ierr); ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetUseStiffnessScaling_C",NULL);CHKERRQ(ierr); ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainScalingFactor_C",NULL);CHKERRQ(ierr); ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainDiagonalScaling_C",NULL);CHKERRQ(ierr); diff --git a/src/ksp/pc/impls/is/pcis.h b/src/ksp/pc/impls/is/pcis.h index 4a42cf9..736ea8c 100644 --- a/src/ksp/pc/impls/is/pcis.h +++ b/src/ksp/pc/impls/is/pcis.h @@ -73,6 +73,7 @@ typedef struct { /* We need: */ /* proc[k].loc_to_glob(proc[k].shared[i][m]) == proc[l].loc_to_glob(proc[l].shared[j][m]) */ /* for all 0 <= m < proc[k].n_shared[i], or equiv'ly, for all 0 <= m < proc[l].n_shared[j] */ + Mat pmat; } PC_IS; PETSC_EXTERN PetscErrorCode PCISSetUp(PC pc); From jed at jedbrown.org Fri Sep 5 22:35:45 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 05 Sep 2014 21:35:45 -0600 Subject: 
[petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <87egvpy3gt.fsf@jedbrown.org> Message-ID: <87lhpxwhda.fsf@jedbrown.org> Satish Balay writes: > Perhaps the following is the fix [with proper comments, more error > checks?]. But someone more familiar with this code should check this.. > > Satish > > -------------- > $ git diff |cat > diff --git a/src/ksp/pc/impls/is/pcis.c b/src/ksp/pc/impls/is/pcis.c > index dab5836..0fa0217 100644 > --- a/src/ksp/pc/impls/is/pcis.c > +++ b/src/ksp/pc/impls/is/pcis.c > @@ -140,6 +140,8 @@ PetscErrorCode PCISSetUp(PC pc) > ierr = PetscObjectTypeCompare((PetscObject)pc->pmat,MATIS,&flg);CHKERRQ(ierr); > if (!flg) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONG,"Preconditioner type of Neumann Neumman requires matrix of type MATIS"); > matis = (Mat_IS*)pc->pmat->data; > + PetscObjectReference((PetscObject)pc->pmat); > + pcis->pmat = pc->pmat; Uh, PCISSetUp can be called more than once? And simply destroying the pcis->pmat reference is not enough because that extra reference could significantly increase the peak memory usage. The right solution is to not hold that reference and not hold the info. > pcis->pure_neumann = matis->pure_neumann; > > @@ -378,8 +380,9 @@ PetscErrorCode PCISDestroy(PC pc) > ierr = VecScatterDestroy(&pcis->global_to_B);CHKERRQ(ierr); > ierr = PetscFree(pcis->work_N);CHKERRQ(ierr); > if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { > - ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > + ierr = ISLocalToGlobalMappingRestoreInfo(((Mat_IS*)pcis->pmat->data)->mapping,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > } Why not restore the info at the place it is gotten, like we do with every other accessor? > + ierr = MatDestroy(&pcis->pmat);CHKERRQ(ierr); > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetUseStiffnessScaling_C",NULL);CHKERRQ(ierr); > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainScalingFactor_C",NULL);CHKERRQ(ierr); > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainDiagonalScaling_C",NULL);CHKERRQ(ierr); > diff --git a/src/ksp/pc/impls/is/pcis.h b/src/ksp/pc/impls/is/pcis.h > index 4a42cf9..736ea8c 100644 > --- a/src/ksp/pc/impls/is/pcis.h > +++ b/src/ksp/pc/impls/is/pcis.h > @@ -73,6 +73,7 @@ typedef struct { > /* We need: */ > /* proc[k].loc_to_glob(proc[k].shared[i][m]) == proc[l].loc_to_glob(proc[l].shared[j][m]) */ > /* for all 0 <= m < proc[k].n_shared[i], or equiv'ly, for all 0 <= m < proc[l].n_shared[j] */ > + Mat pmat; > } PC_IS; > > PETSC_EXTERN PetscErrorCode PCISSetUp(PC pc); -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jychang48 at gmail.com Fri Sep 5 22:55:39 2014 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 5 Sep 2014 22:55:39 -0500 Subject: [petsc-users] FE discretization in DMPlex Message-ID: <896C129B-43B0-4E5C-A5B7-ADC604E34892@gmail.com> Hi all, So I understand how the FEM code works in the DMPlex examples (ex12 and 62). Pardon me if this is a silly question. 1) If I wanted to solve either the poisson or stokes using the discontinuous Galerkin method, is there a way to do this with the built-in DMPlex/FEM functions? 
Basically each cell/element has its own set of degrees of freedom, and jump/average operations would be needed to "connect" the dofs across element interfaces. 2) Or how about using something like Raviart-Thomas spaces (we'll say lowest order for simplicity). Where the velocity dofs are not nodal quantities, instead they are denoted by edge fluxes (or face fluxes for tetrahedrals). Pressure would be piecewise constant. Intuitively these should be doable if I were to write my own DMPlex/PetscSection code, but I was wondering if the above two discretizations are achievable in the way ex12 and ex62 are. Thanks, Justin From balay at mcs.anl.gov Sat Sep 6 01:18:58 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 01:18:58 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: <87lhpxwhda.fsf@jedbrown.org> References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <87egvpy3gt.fsf@jedbrown.org> <87lhpxwhda.fsf@jedbrown.org> Message-ID: On Fri, 5 Sep 2014, Jed Brown wrote: > Satish Balay writes: > > Perhaps the following is the fix [with proper comments, more error > > checks?]. But someone more familiar with this code should check this.. > > > > Satish > > > > -------------- > > $ git diff |cat > > diff --git a/src/ksp/pc/impls/is/pcis.c b/src/ksp/pc/impls/is/pcis.c > > index dab5836..0fa0217 100644 > > --- a/src/ksp/pc/impls/is/pcis.c > > +++ b/src/ksp/pc/impls/is/pcis.c > > @@ -140,6 +140,8 @@ PetscErrorCode PCISSetUp(PC pc) > > ierr = PetscObjectTypeCompare((PetscObject)pc->pmat,MATIS,&flg);CHKERRQ(ierr); > > if (!flg) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONG,"Preconditioner type of Neumann Neumman requires matrix of type MATIS"); > > matis = (Mat_IS*)pc->pmat->data; > > + PetscObjectReference((PetscObject)pc->pmat); > > + pcis->pmat = pc->pmat; > > Uh, PCISSetUp can be called more than once? I have no idea.. > > And simply destroying the pcis->pmat reference is not enough because > that extra reference could significantly increase the peak memory usage. Curently the object (pc->pmat) is destroyed at the end anyway [perhaps duing PCDestroy()]. This fix changes the order a bit so that its destoryed only after its last use. > The right solution is to not hold that reference and not hold the info. > > > pcis->pure_neumann = matis->pure_neumann; > > > > @@ -378,8 +380,9 @@ PetscErrorCode PCISDestroy(PC pc) > > ierr = VecScatterDestroy(&pcis->global_to_B);CHKERRQ(ierr); > > ierr = PetscFree(pcis->work_N);CHKERRQ(ierr); > > if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { > > - ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > > + ierr = ISLocalToGlobalMappingRestoreInfo(((Mat_IS*)pcis->pmat->data)->mapping,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > > } > > Why not restore the info at the place it is gotten, like we do with > every other accessor? Looks like this info is stashed in 'pcis->n_neigh, pcis->neigh' etc - and reused later multple times. [perhaps preventing multiple mallocs/frees] $ git grep -l 'pcis->n_neigh' src/ksp/pc/impls/bddc/bddcfetidp.c src/ksp/pc/impls/is/nn/nn.c src/ksp/pc/impls/is/pcis.c Or perhaps this info should be stashed in the IS so multiple ISLocalToGlobalMappingGetInfo() calls are cheap [but then the malloc'd memory will live until IS is destroyed anyway] I guess there are 2 issues you are touching on. 
A fix for this crash - and code cleanup. My patch gets the examples working. But I'll defer both isses to Stefano [asuming he is aquainted with the above sources]. Satish > > > + ierr = MatDestroy(&pcis->pmat);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetUseStiffnessScaling_C",NULL);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainScalingFactor_C",NULL);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainDiagonalScaling_C",NULL);CHKERRQ(ierr); > > diff --git a/src/ksp/pc/impls/is/pcis.h b/src/ksp/pc/impls/is/pcis.h > > index 4a42cf9..736ea8c 100644 > > --- a/src/ksp/pc/impls/is/pcis.h > > +++ b/src/ksp/pc/impls/is/pcis.h > > @@ -73,6 +73,7 @@ typedef struct { > > /* We need: */ > > /* proc[k].loc_to_glob(proc[k].shared[i][m]) == proc[l].loc_to_glob(proc[l].shared[j][m]) */ > > /* for all 0 <= m < proc[k].n_shared[i], or equiv'ly, for all 0 <= m < proc[l].n_shared[j] */ > > + Mat pmat; > > } PC_IS; > > > > PETSC_EXTERN PetscErrorCode PCISSetUp(PC pc); > > From knepley at gmail.com Sat Sep 6 03:58:03 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 6 Sep 2014 03:58:03 -0500 Subject: [petsc-users] FE discretization in DMPlex In-Reply-To: <896C129B-43B0-4E5C-A5B7-ADC604E34892@gmail.com> References: <896C129B-43B0-4E5C-A5B7-ADC604E34892@gmail.com> Message-ID: On Fri, Sep 5, 2014 at 10:55 PM, Justin Chang wrote: > Hi all, > > So I understand how the FEM code works in the DMPlex examples (ex12 and > 62). Pardon me if this is a silly question. > > 1) If I wanted to solve either the poisson or stokes using the > discontinuous Galerkin method, is there a way to do this with the built-in > DMPlex/FEM functions? Basically each cell/element has its own set of > degrees of freedom, and jump/average operations would be needed to > "connect" the dofs across element interfaces. > > 2) Or how about using something like Raviart-Thomas spaces (we'll say > lowest order for simplicity). Where the velocity dofs are not nodal > quantities, instead they are denoted by edge fluxes (or face fluxes for > tetrahedrals). Pressure would be piecewise constant. > > Intuitively these should be doable if I were to write my own > DMPlex/PetscSection code, but I was wondering if the above two > discretizations are achievable in the way ex12 and ex62 are. > Lets do RT first since its easier. The primal space is P_K = Poly_{q--1}(K) + x Poly_{q-1}(K) so at lowest order its just Poly_1. The dual space is moments of the normal component of velocity on the edges. So you would write a dual space where the functionals integrated the normal component. This is the tricky part: http://www.math.chalmers.se/~logg/pub/papers/KirbyLoggEtAl2010a.pdf DG is just a generalization of this kind of thing where you need to a) have some geometric quantities available to the pointwise functions (like h), and also some field quantities (like the jump and average). I understand exactly how I want to do the RT, BDM, BDMF, and NED elements, and those will be in soon. I think DG is fairly messy and am not completely sure what I want here. Matt > Thanks, > Justin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alpkalpalp at gmail.com Sat Sep 6 04:35:51 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Sat, 6 Sep 2014 12:35:51 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: Hi, I switched back to master again and PETSC_VERSION_SUBMINOR is equal to 1 again. I think with the merge of stefano_zampini/pcbddc-primalfixe it is changing to 0. As you suggested I removed --with-shared-libraries option, but although make all seems working make test is failed with attached errors. thanks On Sat, Sep 6, 2014 at 2:44 AM, Satish Balay wrote: > perhaps unrelated - but are you sure you had the latest master? > > >>>> > PETSC_VERSION_RELEASE 0 > PETSC_VERSION_MAJOR 3 > PETSC_VERSION_MINOR 5 > PETSC_VERSION_SUBMINOR 0 > PETSC_VERSION_PATCH 0 > <<<<<<< > > PETSC_VERSION_PATCH should be 1 > > > >>>>> > CLINKER /home/alp/petsc/arch-mswin-c-debug/lib/libpetsc.so.3.05.0 > Warning: corrupt .drectve at end of def file > Warning: corrupt .drectve at end of def file > Warning: corrupt .drectve at end of def file > > rch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): undefined > reference to `__security_cookie' > arch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): relocation > truncated to fit: R_X86_64_PC32 against undefined symbol `__security_cookie' > <<<<<< > > Hm - must be related to --with-shared-libraries option you are > specifying. This is untested on windows. > > Can you remove it and retry? > > Satish > > On Fri, 5 Sep 2014, Alp Kalpalp wrote: > > > Hi, > > > > Sorry for the late response. I tried the same sequence. But I need to say > > that I use git pull before it. Error is now during make all > > > > you may find the logs in the attachment. > > > > thanks beforehand > > > > regards, > > > > > > On Fri, Sep 5, 2014 at 2:51 PM, Matthew Knepley > wrote: > > > > > Please send your configure.log and make.log > > > > > > Thanks, > > > > > > Matt > > > > > > > > > On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp > wrote: > > > > > >> As I said before, I have checked out "master" branch and merged with > your > > >> stefano_zampini/pcbddc-primalfixe branch. configured, compiled > successfully. > > >> I have used --with-pcbddc option in configure as Barry suggested. 
> > >> > > >> However, tests are failed with following reason: > > >> > > >> akalpalp at a-kalpalp ~/petsc > > >> $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > test > > >> Running test examples to verify correct installation > > >> Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug > > >> *******************Error detected during compile or > > >> link!******************* > > >> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >> /home/akalpalp/petsc/src/snes/examples/tutorials ex19 > > >> > > >> > ********************************************************************************* > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall > > >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > > >> -I/home/akalpalp/petsc/include > > >> -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall > -Wwrite-strings > > >> -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o > > >> -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc > > >> -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack > > >> -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl > > >> > /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): > > >> undefined reference to `PCCreate_BDDC' > > >> collect2: error: ld returned 1 exit status > > >> makefile:108: recipe for target 'ex19' failed > > >> make[3]: [ex19] Error 1 (ignored) > > >> /usr/bin/rm -f ex19.o > > >> Completed test examples > > >> ========================================= > > >> Now to evaluate the computer systems you plan use - do: > > >> make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > streams > > >> NPMAX= > > >> > > >> > > >> > > >> On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith > wrote: > > >> > > >>> > > >>> This is likely due to the horrible horrible fact that some of the > > >>> bddc files only get compiled if ./configure is run with the option > > >>> --with-pcbddc you will need to rerun ./configure and then make with > that > > >>> option. > > >>> > > >>> I pray that someone removes that horrible confusing configure > option. > > >>> > > >>> Barry > > >>> > > >>> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp > wrote: > > >>> > > >>> > Dear Stefano, > > >>> > > > >>> > I have checked out "master" branch and merged with your > > >>> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and > all > > >>> tests are completed successfully. > > >>> > Then, I tried to compile ex59 with make ex59, it results in > unresolved > > >>> external error. I believe your bddc_feti files are not included in > > >>> compilation. 
> > >>> > Since I am not experienced on how to solve issues in Petsc, I need > to > > >>> ask several questions; > > >>> > > > >>> > 1-) Are there any global settings to add additonal directories to > > >>> compilation (src\ksp\pc\impls\bddc) > > >>> > 2-) or should I include these files on top of ex59 (AFAIK, > including > > >>> .c files is not a good thing) > > >>> > 3-) and finally what is the better way of helping you (creating > > >>> another branch from yours or what) > > >>> > > > >>> > Thanks in advance > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > > >>> stefano.zampini at gmail.com> wrote: > > >>> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one > is > > >>> the dual of the other) and it does not have its own classes so far. > > >>> > > > >>> > That said, you can experiment with FETI-DP only after having setup > a > > >>> BDDC preconditioner with the options and customization you prefer. > > >>> > Use > > >>> > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > > >>> for manual pages. > > >>> > > > >>> > For an 'how to' with FETIDP, please see > > >>> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically > look > > >>> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > > >>> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once > you have > > >>> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution > to > > >>> obtain a right-hand side for the FETIDP system and a physical > solution from > > >>> the solution of the FETIDP system. > > >>> > > > >>> > I would recommend you to use the development version of the library > > >>> and either use the ?next? branch or the ?master' branch after having > merged > > >>> in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? > also > > >>> contains the new deluxe scaling operator for BDDC which is not > available to > > >>> use with FETI-DP. > > >>> > > > >>> > If you have any other questions which can be useful for other PETSc > > >>> users, please use the mailing list; otherwise you can contact me > personally. > > >>> > > > >>> > Stefano > > >>> > > > >>> > > > >>> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > >>> > > > >>> >> Matthew Knepley writes: > > >>> >>>> 1- Is it possible to complete a FETI-DP solution with the > provided > > >>> >>>> functions in current PetSc release? > > >>> >>>> > > >>> >>> > > >>> >>> There is no FETI-DP in PETSc. > > >>> >> > > >>> >> Wrong. There is PCBDDC, which has the same eigenvalues as > FETI-DP. > > >>> You > > >>> >> can enable it by configuring --with-pcbddc. This will be turned > on by > > >>> >> default soon. It is fairly new, so you should use the branch > 'master' > > >>> >> instead of the release. It has an option to do FETI-DP instead of > > >>> BDDC. > > >>> >> See src/ksp/ksp/examples/tutorials/ex59.c. > > >>> >> > > >>> >> For either of these methods, you have to assemble a MATIS. If > you use > > >>> >> MatSetValuesLocal, most of your assembly code can stay the same. > > >>> >> > > >>> >> Hopefully we can get better examples before the next release. > Stefano > > >>> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, > but > > >>> we > > >>> >> really need more complete tests within PETSc. 
> > >>> > > > >>> > > > >>> > > >>> > > >> > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > their > > > experiments lead. > > > -- Norbert Wiener > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1870643 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 20671 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.log Type: application/octet-stream Size: 113996 bytes Desc: not available URL: From balay at mcs.anl.gov Sat Sep 6 08:32:17 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 08:32:17 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: On Sat, 6 Sep 2014, Alp Kalpalp wrote: > Hi, > > I switched back to master again and PETSC_VERSION_SUBMINOR is equal to 1 > again. I think with the merge of stefano_zampini/pcbddc-primalfixe > it is changing to 0. the merge won't change PETSC_VERSION_SUBMINOR > > As you suggested I removed --with-shared-libraries option, but although > make all seems working make test is failed with attached errors. I'll suggest "rm -rf /home/alp/petsc/arch-mswin-c-debug/" and then retry or if you could clean the git repo with: git reset --hard git clean -f -d -x And then rerun configure Satish > > thanks > > > On Sat, Sep 6, 2014 at 2:44 AM, Satish Balay wrote: > > > perhaps unrelated - but are you sure you had the latest master? > > > > >>>> > > PETSC_VERSION_RELEASE 0 > > PETSC_VERSION_MAJOR 3 > > PETSC_VERSION_MINOR 5 > > PETSC_VERSION_SUBMINOR 0 > > PETSC_VERSION_PATCH 0 > > <<<<<<< > > > > PETSC_VERSION_PATCH should be 1 > > > > > > >>>>> > > CLINKER /home/alp/petsc/arch-mswin-c-debug/lib/libpetsc.so.3.05.0 > > Warning: corrupt .drectve at end of def file > > Warning: corrupt .drectve at end of def file > > Warning: corrupt .drectve at end of def file > > > > rch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): undefined > > reference to `__security_cookie' > > arch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): relocation > > truncated to fit: R_X86_64_PC32 against undefined symbol `__security_cookie' > > <<<<<< > > > > Hm - must be related to --with-shared-libraries option you are > > specifying. This is untested on windows. > > > > Can you remove it and retry? > > > > Satish > > > > On Fri, 5 Sep 2014, Alp Kalpalp wrote: > > > > > Hi, > > > > > > Sorry for the late response. I tried the same sequence. But I need to say > > > that I use git pull before it. Error is now during make all > > > > > > you may find the logs in the attachment. > > > > > > thanks beforehand > > > > > > regards, > > > > > > > > > On Fri, Sep 5, 2014 at 2:51 PM, Matthew Knepley > > wrote: > > > > > > > Please send your configure.log and make.log > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > > > > > On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp > > wrote: > > > > > > > >> As I said before, I have checked out "master" branch and merged with > > your > > > >> stefano_zampini/pcbddc-primalfixe branch. 
configured, compiled > > successfully. > > > >> I have used --with-pcbddc option in configure as Barry suggested. > > > >> > > > >> However, tests are failed with following reason: > > > >> > > > >> akalpalp at a-kalpalp ~/petsc > > > >> $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > > test > > > >> Running test examples to verify correct installation > > > >> Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug > > > >> *******************Error detected during compile or > > > >> link!******************* > > > >> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > >> /home/akalpalp/petsc/src/snes/examples/tutorials ex19 > > > >> > > > >> > > ********************************************************************************* > > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall > > > >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > > > >> -I/home/akalpalp/petsc/include > > > >> -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c > > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall > > -Wwrite-strings > > > >> -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o > > > >> -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc > > > >> -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack > > > >> -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl > > > >> > > /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): > > > >> undefined reference to `PCCreate_BDDC' > > > >> collect2: error: ld returned 1 exit status > > > >> makefile:108: recipe for target 'ex19' failed > > > >> make[3]: [ex19] Error 1 (ignored) > > > >> /usr/bin/rm -f ex19.o > > > >> Completed test examples > > > >> ========================================= > > > >> Now to evaluate the computer systems you plan use - do: > > > >> make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > > streams > > > >> NPMAX= > > > >> > > > >> > > > >> > > > >> On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith > > wrote: > > > >> > > > >>> > > > >>> This is likely due to the horrible horrible fact that some of the > > > >>> bddc files only get compiled if ./configure is run with the option > > > >>> --with-pcbddc you will need to rerun ./configure and then make with > > that > > > >>> option. > > > >>> > > > >>> I pray that someone removes that horrible confusing configure > > option. > > > >>> > > > >>> Barry > > > >>> > > > >>> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp > > wrote: > > > >>> > > > >>> > Dear Stefano, > > > >>> > > > > >>> > I have checked out "master" branch and merged with your > > > >>> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and > > all > > > >>> tests are completed successfully. > > > >>> > Then, I tried to compile ex59 with make ex59, it results in > > unresolved > > > >>> external error. I believe your bddc_feti files are not included in > > > >>> compilation. 
> > > >>> > Since I am not experienced on how to solve issues in Petsc, I need > > to > > > >>> ask several questions; > > > >>> > > > > >>> > 1-) Are there any global settings to add additonal directories to > > > >>> compilation (src\ksp\pc\impls\bddc) > > > >>> > 2-) or should I include these files on top of ex59 (AFAIK, > > including > > > >>> .c files is not a good thing) > > > >>> > 3-) and finally what is the better way of helping you (creating > > > >>> another branch from yours or what) > > > >>> > > > > >>> > Thanks in advance > > > >>> > > > > >>> > > > > >>> > > > > >>> > > > > >>> > > > > >>> > > > > >>> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > > > >>> stefano.zampini at gmail.com> wrote: > > > >>> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one > > is > > > >>> the dual of the other) and it does not have its own classes so far. > > > >>> > > > > >>> > That said, you can experiment with FETI-DP only after having setup > > a > > > >>> BDDC preconditioner with the options and customization you prefer. > > > >>> > Use > > > >>> > > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > > > >>> for manual pages. > > > >>> > > > > >>> > For an 'how to' with FETIDP, please see > > > >>> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically > > look > > > >>> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > > > >>> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once > > you have > > > >>> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution > > to > > > >>> obtain a right-hand side for the FETIDP system and a physical > > solution from > > > >>> the solution of the FETIDP system. > > > >>> > > > > >>> > I would recommend you to use the development version of the library > > > >>> and either use the ?next? branch or the ?master' branch after having > > merged > > > >>> in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? > > also > > > >>> contains the new deluxe scaling operator for BDDC which is not > > available to > > > >>> use with FETI-DP. > > > >>> > > > > >>> > If you have any other questions which can be useful for other PETSc > > > >>> users, please use the mailing list; otherwise you can contact me > > personally. > > > >>> > > > > >>> > Stefano > > > >>> > > > > >>> > > > > >>> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > > >>> > > > > >>> >> Matthew Knepley writes: > > > >>> >>>> 1- Is it possible to complete a FETI-DP solution with the > > provided > > > >>> >>>> functions in current PetSc release? > > > >>> >>>> > > > >>> >>> > > > >>> >>> There is no FETI-DP in PETSc. > > > >>> >> > > > >>> >> Wrong. There is PCBDDC, which has the same eigenvalues as > > FETI-DP. > > > >>> You > > > >>> >> can enable it by configuring --with-pcbddc. This will be turned > > on by > > > >>> >> default soon. It is fairly new, so you should use the branch > > 'master' > > > >>> >> instead of the release. It has an option to do FETI-DP instead of > > > >>> BDDC. > > > >>> >> See src/ksp/ksp/examples/tutorials/ex59.c. > > > >>> >> > > > >>> >> For either of these methods, you have to assemble a MATIS. If > > you use > > > >>> >> MatSetValuesLocal, most of your assembly code can stay the same. > > > >>> >> > > > >>> >> Hopefully we can get better examples before the next release. > > Stefano > > > >>> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, > > but > > > >>> we > > > >>> >> really need more complete tests within PETSc. 
> > > >>> > > > > >>> > > > > >>> > > > >>> > > > >> > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their > > > > experiments is infinitely more interesting than any results to which > > their > > > > experiments lead. > > > > -- Norbert Wiener > > > > > > > > > > From balay at mcs.anl.gov Sat Sep 6 08:48:59 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 08:48:59 -0500 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <9F78868B-D65D-4AC8-A47B-0B7E2E5B18EA@mcs.anl.gov> Message-ID: BTW: You could also instal lapack (liblapack-devel), openmpi (openmpi-devel) packages from cygwin - and avoid --download-f2cblaslapack and --download-mpich Satish On Sat, 6 Sep 2014, Satish Balay wrote: > On Sat, 6 Sep 2014, Alp Kalpalp wrote: > > > Hi, > > > > I switched back to master again and PETSC_VERSION_SUBMINOR is equal to 1 > > again. I think with the merge of stefano_zampini/pcbddc-primalfixe > > it is changing to 0. > > the merge won't change PETSC_VERSION_SUBMINOR > > > > > As you suggested I removed --with-shared-libraries option, but although > > make all seems working make test is failed with attached errors. > > I'll suggest "rm -rf /home/alp/petsc/arch-mswin-c-debug/" and then retry > > or if you could clean the git repo with: > git reset --hard > git clean -f -d -x > > And then rerun configure > > Satish > > > > > thanks > > > > > > On Sat, Sep 6, 2014 at 2:44 AM, Satish Balay wrote: > > > > > perhaps unrelated - but are you sure you had the latest master? > > > > > > >>>> > > > PETSC_VERSION_RELEASE 0 > > > PETSC_VERSION_MAJOR 3 > > > PETSC_VERSION_MINOR 5 > > > PETSC_VERSION_SUBMINOR 0 > > > PETSC_VERSION_PATCH 0 > > > <<<<<<< > > > > > > PETSC_VERSION_PATCH should be 1 > > > > > > > > > >>>>> > > > CLINKER /home/alp/petsc/arch-mswin-c-debug/lib/libpetsc.so.3.05.0 > > > Warning: corrupt .drectve at end of def file > > > Warning: corrupt .drectve at end of def file > > > Warning: corrupt .drectve at end of def file > > > > > > rch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): undefined > > > reference to `__security_cookie' > > > arch-mswin-c-debug/obj/src/sys/utils/mpiu.o:(.text+0xdd2): relocation > > > truncated to fit: R_X86_64_PC32 against undefined symbol `__security_cookie' > > > <<<<<< > > > > > > Hm - must be related to --with-shared-libraries option you are > > > specifying. This is untested on windows. > > > > > > Can you remove it and retry? > > > > > > Satish > > > > > > On Fri, 5 Sep 2014, Alp Kalpalp wrote: > > > > > > > Hi, > > > > > > > > Sorry for the late response. I tried the same sequence. But I need to say > > > > that I use git pull before it. Error is now during make all > > > > > > > > you may find the logs in the attachment. > > > > > > > > thanks beforehand > > > > > > > > regards, > > > > > > > > > > > > On Fri, Sep 5, 2014 at 2:51 PM, Matthew Knepley > > > wrote: > > > > > > > > > Please send your configure.log and make.log > > > > > > > > > > Thanks, > > > > > > > > > > Matt > > > > > > > > > > > > > > > On Fri, Sep 5, 2014 at 4:27 AM, Alp Kalpalp > > > wrote: > > > > > > > > > >> As I said before, I have checked out "master" branch and merged with > > > your > > > > >> stefano_zampini/pcbddc-primalfixe branch. configured, compiled > > > successfully. > > > > >> I have used --with-pcbddc option in configure as Barry suggested. 
> > > > >> > > > > >> However, tests are failed with following reason: > > > > >> > > > > >> akalpalp at a-kalpalp ~/petsc > > > > >> $ make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > > > test > > > > >> Running test examples to verify correct installation > > > > >> Using PETSC_DIR=/home/akalpalp/petsc and PETSC_ARCH=arch-mswin-c-debug > > > > >> *******************Error detected during compile or > > > > >> link!******************* > > > > >> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > > >> /home/akalpalp/petsc/src/snes/examples/tutorials ex19 > > > > >> > > > > >> > > > ********************************************************************************* > > > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -o ex19.o -c -Wall > > > > >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 > > > > >> -I/home/akalpalp/petsc/include > > > > >> -I/home/akalpalp/petsc/arch-mswin-c-debug/include `pwd`/ex19.c > > > > >> /home/akalpalp/petsc/arch-mswin-c-debug/bin/mpicc -Wall > > > -Wwrite-strings > > > > >> -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex19 ex19.o > > > > >> -L/home/akalpalp/petsc/arch-mswin-c-debug/lib -lpetsc > > > > >> -Wl,-rpath,/home/akalpalp/petsc/arch-mswin-c-debug/lib -lf2clapack > > > > >> -lf2cblas -lpthread -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl > > > > >> > > > /home/akalpalp/petsc/arch-mswin-c-debug/lib/libpetsc.a(pcregis.o):pcregis.c:(.rdata$.refptr.PCCreate_BDDC[.refptr.PCCreate_BDDC]+0x0): > > > > >> undefined reference to `PCCreate_BDDC' > > > > >> collect2: error: ld returned 1 exit status > > > > >> makefile:108: recipe for target 'ex19' failed > > > > >> make[3]: [ex19] Error 1 (ignored) > > > > >> /usr/bin/rm -f ex19.o > > > > >> Completed test examples > > > > >> ========================================= > > > > >> Now to evaluate the computer systems you plan use - do: > > > > >> make PETSC_DIR=/home/akalpalp/petsc PETSC_ARCH=arch-mswin-c-debug > > > streams > > > > >> NPMAX= > > > > >> > > > > >> > > > > >> > > > > >> On Thu, Sep 4, 2014 at 5:37 PM, Barry Smith > > > wrote: > > > > >> > > > > >>> > > > > >>> This is likely due to the horrible horrible fact that some of the > > > > >>> bddc files only get compiled if ./configure is run with the option > > > > >>> --with-pcbddc you will need to rerun ./configure and then make with > > > that > > > > >>> option. > > > > >>> > > > > >>> I pray that someone removes that horrible confusing configure > > > option. > > > > >>> > > > > >>> Barry > > > > >>> > > > > >>> On Sep 4, 2014, at 8:52 AM, Alp Kalpalp > > > wrote: > > > > >>> > > > > >>> > Dear Stefano, > > > > >>> > > > > > >>> > I have checked out "master" branch and merged with your > > > > >>> stefano_zampini/pcbddc-primalfixe branch. configured, compiled and > > > all > > > > >>> tests are completed successfully. > > > > >>> > Then, I tried to compile ex59 with make ex59, it results in > > > unresolved > > > > >>> external error. I believe your bddc_feti files are not included in > > > > >>> compilation. 
> > > > >>> > Since I am not experienced on how to solve issues in Petsc, I need > > > to > > > > >>> ask several questions; > > > > >>> > > > > > >>> > 1-) Are there any global settings to add additonal directories to > > > > >>> compilation (src\ksp\pc\impls\bddc) > > > > >>> > 2-) or should I include these files on top of ex59 (AFAIK, > > > including > > > > >>> .c files is not a good thing) > > > > >>> > 3-) and finally what is the better way of helping you (creating > > > > >>> another branch from yours or what) > > > > >>> > > > > > >>> > Thanks in advance > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > > > > > >>> > On Wed, Sep 3, 2014 at 6:18 PM, Stefano Zampini < > > > > >>> stefano.zampini at gmail.com> wrote: > > > > >>> > FETIDP is in PETSc as a byproduct of the BDDC preconditioner (one > > > is > > > > >>> the dual of the other) and it does not have its own classes so far. > > > > >>> > > > > > >>> > That said, you can experiment with FETI-DP only after having setup > > > a > > > > >>> BDDC preconditioner with the options and customization you prefer. > > > > >>> > Use > > > > >>> > > > http://www.mcs.anl.gov/petsc/petsc-dev/docs/manualpages/KSP/index.html > > > > >>> for manual pages. > > > > >>> > > > > > >>> > For an 'how to' with FETIDP, please see > > > > >>> src/ksp/ksp/examples/tutorials/ex.59.c as Jed told you, specifically > > > look > > > > >>> at ComputeKSPFETIDP for obtaining the FETIDP matrix F (implemented > > > > >>> Matrix-free) and the optimal FETIDP dirichlet preconditioner. Once > > > you have > > > > >>> F, you can use PCBDDCMatFETIDPGetRHS and PCBDDCMatFetiDPGetSolution > > > to > > > > >>> obtain a right-hand side for the FETIDP system and a physical > > > solution from > > > > >>> the solution of the FETIDP system. > > > > >>> > > > > > >>> > I would recommend you to use the development version of the library > > > > >>> and either use the ?next? branch or the ?master' branch after having > > > merged > > > > >>> in the branch stefano_zampini/pcbddc-primalfixes. Note that ?next? > > > also > > > > >>> contains the new deluxe scaling operator for BDDC which is not > > > available to > > > > >>> use with FETI-DP. > > > > >>> > > > > > >>> > If you have any other questions which can be useful for other PETSc > > > > >>> users, please use the mailing list; otherwise you can contact me > > > personally. > > > > >>> > > > > > >>> > Stefano > > > > >>> > > > > > >>> > > > > > >>> > On Sep 3, 2014, at 5:19 PM, Jed Brown wrote: > > > > >>> > > > > > >>> >> Matthew Knepley writes: > > > > >>> >>>> 1- Is it possible to complete a FETI-DP solution with the > > > provided > > > > >>> >>>> functions in current PetSc release? > > > > >>> >>>> > > > > >>> >>> > > > > >>> >>> There is no FETI-DP in PETSc. > > > > >>> >> > > > > >>> >> Wrong. There is PCBDDC, which has the same eigenvalues as > > > FETI-DP. > > > > >>> You > > > > >>> >> can enable it by configuring --with-pcbddc. This will be turned > > > on by > > > > >>> >> default soon. It is fairly new, so you should use the branch > > > 'master' > > > > >>> >> instead of the release. It has an option to do FETI-DP instead of > > > > >>> BDDC. > > > > >>> >> See src/ksp/ksp/examples/tutorials/ex59.c. > > > > >>> >> > > > > >>> >> For either of these methods, you have to assemble a MATIS. If > > > you use > > > > >>> >> MatSetValuesLocal, most of your assembly code can stay the same. > > > > >>> >> > > > > >>> >> Hopefully we can get better examples before the next release. 
> > > Stefano > > > > >>> >> (the author of PCBDDC, Cc'd) tests mostly with external packages, > > > but > > > > >>> we > > > > >>> >> really need more complete tests within PETSc. > > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > > >> > > > > > > > > > > > > > > > -- > > > > > What most experimenters take for granted before they begin their > > > > > experiments is infinitely more interesting than any results to which > > > their > > > > > experiments lead. > > > > > -- Norbert Wiener > > > > > > > > > > > > > > > From alpkalpalp at gmail.com Sat Sep 6 16:21:50 2014 From: alpkalpalp at gmail.com (Alp Kalpalp) Date: Sun, 7 Sep 2014 00:21:50 +0300 Subject: [petsc-users] Petsc configuration and multiple branch working Message-ID: Hi, Nowadays I am trying to build a branch and sometimes I am swittching back to master branch. I experienced that configuring and making a branch requires almost 30 minutes on cygwin. I wonder; 1- How do you manage more than one branches? 2- Do we need to reconfigure after a git pull? 3- Are switching back and forward in between branches like me? or just using seperate work-dir for each branch? please help me I am so exhausted to configure and make petsc!!!! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 6 16:30:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 6 Sep 2014 16:30:19 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: Message-ID: On Sat, Sep 6, 2014 at 4:21 PM, Alp Kalpalp wrote: > Hi, > > Nowadays I am trying to build a branch and sometimes I am swittching back > to master branch. > I experienced that configuring and making a branch requires almost 30 > minutes on cygwin. > I wonder; > > 1- How do you manage more than one branches? > a) Do not reconfigure. You need (normally) only remake across branch changes b) Use ccache to minimzie recompiling c) Run a virtual machine (maybe VirtualBox or VMWare) for Linux since the filesystem latencies on Windows are so much worse. > 2- Do we need to reconfigure after a git pull? > Not normally > 3- Are switching back and forward in between branches like me? or just > using separate work-dir for each branch? > I switch branches ~ 50 times a day. Matt > please help me I am so exhausted to configure and make petsc!!!! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Sat Sep 6 16:38:08 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 16:38:08 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: Message-ID: On Sat, 6 Sep 2014, Alp Kalpalp wrote: > Hi, > > Nowadays I am trying to build a branch and sometimes I am swittching back > to master branch. switching branch during a build will break the build.. > I experienced that configuring and making a branch requires almost 30 > minutes on cygwin. yes windows/cygwin builds are slow.. Linux is faster. I know ssd speeds up builds on linux. I'm guessing it will significantly speedup windows aswell. > I wonder; > > 1- How do you manage more than one branches? I don't do this yet - but I shoud: use different PETSC_ARCHes for different branches perhaps PETSC_ARCH autogenerated by configure should branch name aswell. 
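A minimal sketch of that per-branch setup, assuming a single git clone of PETSc. The arch names "arch-master" and "arch-pcbddc" are arbitrary labels chosen here (configure does not generate them from the branch name), and the configure options are simply the ones already mentioned in this thread:

# build the master branch into its own arch directory
$ git checkout master
$ ./configure PETSC_ARCH=arch-master --download-f2cblaslapack --download-mpich
$ make PETSC_DIR=$PWD PETSC_ARCH=arch-master all

# build the feature branch into a second arch directory; the two builds coexist
$ git checkout stefano_zampini/pcbddc-primalfixes
$ ./configure PETSC_ARCH=arch-pcbddc --with-pcbddc --download-f2cblaslapack --download-mpich
$ make PETSC_DIR=$PWD PETSC_ARCH=arch-pcbddc all

# run the tests against whichever branch/arch pair is currently checked out
$ make PETSC_DIR=$PWD PETSC_ARCH=arch-pcbddc test

Each branch then keeps its own build tree under $PETSC_DIR/$PETSC_ARCH, so switching branches only rebuilds what actually changed, at the cost of one configure run per branch.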
> 2- Do we need to reconfigure after a git pull? Not normally - but ocassionally there are configure changes which require reconfigure. > 3- Are switching back and forward in between branches like me? or just > using seperate work-dir for each branch? I do both.. And ocassionally do 'rm -rf arch-* && git clean -f -d -x' to get a prinstine repo [which deletes all my builds aswell] > > please help me I am so exhausted to configure and make petsc!!!! Windows builds are slow. But now with gnumake builds - they are much faster than before [configure is still slow]. linux+ssd builds are much faster than windows [hence I'm lazy at maintaining multiple builds..] Satish From bsmith at mcs.anl.gov Sat Sep 6 17:23:47 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 6 Sep 2014 17:23:47 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: Message-ID: For your configuration I would recommend multiple local clones. Once you have the first petsc clone just do git clone petsc petsc-clone-1 git clone petsc petsc-clone-2 Now you have three PETSc directories and can have different branches in each. Note that git is ?smart? and does not copy the entire repository for each clone; it maintains one repository but three working directories. The problem with using different PETSC_ARCH for each branch is that when you switch the branches it will sometimes/often change an include file that many of the C files are dependent on so make gmake will require recompiling much of the library, with different working directories this will not happen. Barry On Sep 6, 2014, at 4:21 PM, Alp Kalpalp wrote: > Hi, > > Nowadays I am trying to build a branch and sometimes I am swittching back to master branch. > I experienced that configuring and making a branch requires almost 30 minutes on cygwin. > I wonder; > > 1- How do you manage more than one branches? > 2- Do we need to reconfigure after a git pull? > 3- Are switching back and forward in between branches like me? or just using seperate work-dir for each branch? > > please help me I am so exhausted to configure and make petsc!!!! From jed at jedbrown.org Sat Sep 6 17:50:05 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 06 Sep 2014 16:50:05 -0600 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: Message-ID: <87ha0kwehu.fsf@jedbrown.org> Barry Smith writes: > The problem with using different PETSC_ARCH for each branch is that > when you switch the branches it will sometimes/often change an > include file that many of the C files are dependent on so make > gmake will require recompiling much of the library, with different > working directories this will not happen. I use ccache so that those "recompiles" take less than 10 seconds on average. I don't think having a separate clone per branch is useful, so I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure is usually not necessary unless you have to go way back in history. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sat Sep 6 18:03:40 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 6 Sep 2014 18:03:40 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: <87ha0kwehu.fsf@jedbrown.org> References: <87ha0kwehu.fsf@jedbrown.org> Message-ID: <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> How do you use ccache? 
Do you do this ?weird? thing To install for the second method, do something like this: cp ccache /usr/local/bin/ ln -s ccache /usr/local/bin/gcc ln -s ccache /usr/local/bin/g++ ln -s ccache /usr/local/bin/cc ln -s ccache /usr/local/bin/c++ Thanks Barry On Sep 6, 2014, at 5:50 PM, Jed Brown wrote: > Barry Smith writes: > >> The problem with using different PETSC_ARCH for each branch is that >> when you switch the branches it will sometimes/often change an >> include file that many of the C files are dependent on so make >> gmake will require recompiling much of the library, with different >> working directories this will not happen. > > I use ccache so that those "recompiles" take less than 10 seconds on > average. I don't think having a separate clone per branch is useful, so > I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure > is usually not necessary unless you have to go way back in history. From balay at mcs.anl.gov Sat Sep 6 18:03:52 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 18:03:52 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: <87ha0kwehu.fsf@jedbrown.org> References: <87ha0kwehu.fsf@jedbrown.org> Message-ID: On Sat, 6 Sep 2014, Jed Brown wrote: > Barry Smith writes: > > > The problem with using different PETSC_ARCH for each branch is that > > when you switch the branches it will sometimes/often change an > > include file that many of the C files are dependent on so make > > gmake will require recompiling much of the library, with different > > working directories this will not happen. > > I use ccache so that those "recompiles" take less than 10 seconds on > average. I don't think having a separate clone per branch is useful, so > I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure > is usually not necessary unless you have to go way back in history. fedora linux defaults to using ccache - I guess I've been using it all along.. Do you do any additional tuning of ccache? $ type gcc g++ gcc is /usr/lib64/ccache/gcc g++ is /usr/lib64/ccache/g++ Satish From jed at jedbrown.org Sat Sep 6 18:11:12 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 06 Sep 2014 17:11:12 -0600 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: <87ha0kwehu.fsf@jedbrown.org> Message-ID: <87egvowdin.fsf@jedbrown.org> Satish Balay writes: > fedora linux defaults to using ccache - I guess I've been using it all along.. > Do you do any additional tuning of ccache? > > $ type gcc g++ > gcc is /usr/lib64/ccache/gcc > g++ is /usr/lib64/ccache/g++ Doesn't need anything else. I don't use Fedora and don't actually want everything cached because I compile some large projects that I'll never recompile, so I inject my wrappers at the MPI level. $ cat ~/usr/ccache/mpich/bin/mpicc #!/bin/dash ccache /opt/mpich/bin/mpicc "$@" But what Fedora is doing is quite sensible. Note that ccache is GPLv3+ so Apple is probably searching for a way to make it not work on a Mac... -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Sat Sep 6 18:14:06 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 6 Sep 2014 18:14:06 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> References: <87ha0kwehu.fsf@jedbrown.org> <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> Message-ID: On Sat, 6 Sep 2014, Barry Smith wrote: > > How do you use ccache? Do you do this ?weird? thing > > To install for the second method, do something like this: > > cp ccache /usr/local/bin/ > ln -s ccache /usr/local/bin/gcc > ln -s ccache /usr/local/bin/g++ > ln -s ccache /usr/local/bin/cc > ln -s ccache /usr/local/bin/c++ My linux box is configured this way [by the distribution] balay at asterix /home/balay $ which gcc /usr/lib64/ccache/gcc balay at asterix /home/balay $ cd /usr/lib64/ccache/ balay at asterix /usr/lib64/ccache $ ls -l total 0 lrwxrwxrwx. 1 root root 16 Sep 2 16:41 c++ -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:39 cc -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:41 g++ -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:39 gcc -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:41 x86_64-redhat-linux-c++ -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:41 x86_64-redhat-linux-g++ -> ../../bin/ccache* lrwxrwxrwx. 1 root root 16 Sep 2 16:39 x86_64-redhat-linux-gcc -> ../../bin/ccache* balay at asterix /usr/lib64/ccache $ Satish > > Thanks > > Barry > > > > On Sep 6, 2014, at 5:50 PM, Jed Brown wrote: > > > Barry Smith writes: > > > >> The problem with using different PETSC_ARCH for each branch is that > >> when you switch the branches it will sometimes/often change an > >> include file that many of the C files are dependent on so make > >> gmake will require recompiling much of the library, with different > >> working directories this will not happen. > > > > I use ccache so that those "recompiles" take less than 10 seconds on > > average. I don't think having a separate clone per branch is useful, so > > I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure > > is usually not necessary unless you have to go way back in history. > > From knepley at gmail.com Sat Sep 6 19:09:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 6 Sep 2014 19:09:01 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> References: <87ha0kwehu.fsf@jedbrown.org> <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> Message-ID: On Sat, Sep 6, 2014 at 6:03 PM, Barry Smith wrote: > > How do you use ccache? Do you do this ?weird? thing > > To install for the second method, do something like this: > --with-cc='ccache gcc' --download-mpich works fine. Matt > cp ccache /usr/local/bin/ > ln -s ccache /usr/local/bin/gcc > ln -s ccache /usr/local/bin/g++ > ln -s ccache /usr/local/bin/cc > ln -s ccache /usr/local/bin/c++ > > Thanks > > Barry > > > > On Sep 6, 2014, at 5:50 PM, Jed Brown wrote: > > > Barry Smith writes: > > > >> The problem with using different PETSC_ARCH for each branch is that > >> when you switch the branches it will sometimes/often change an > >> include file that many of the C files are dependent on so make > >> gmake will require recompiling much of the library, with different > >> working directories this will not happen. 
> > > > I use ccache so that those "recompiles" take less than 10 seconds on > > average. I don't think having a separate clone per branch is useful, so > > I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure > > is usually not necessary unless you have to go way back in history. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Sep 6 19:59:52 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 6 Sep 2014 19:59:52 -0500 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: References: <87ha0kwehu.fsf@jedbrown.org> <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> Message-ID: <442996BE-60FF-4234-B19B-55219F1E76BC@mcs.anl.gov> What will I do with all this new freed up time? Does it work in Xcode? Barry On Sep 6, 2014, at 7:09 PM, Matthew Knepley wrote: > On Sat, Sep 6, 2014 at 6:03 PM, Barry Smith wrote: > > How do you use ccache? Do you do this ?weird? thing > > To install for the second method, do something like this: > > --with-cc='ccache gcc' --download-mpich > > works fine. > > Matt > > cp ccache /usr/local/bin/ > ln -s ccache /usr/local/bin/gcc > ln -s ccache /usr/local/bin/g++ > ln -s ccache /usr/local/bin/cc > ln -s ccache /usr/local/bin/c++ > > Thanks > > Barry > > > > On Sep 6, 2014, at 5:50 PM, Jed Brown wrote: > > > Barry Smith writes: > > > >> The problem with using different PETSC_ARCH for each branch is that > >> when you switch the branches it will sometimes/often change an > >> include file that many of the C files are dependent on so make > >> gmake will require recompiling much of the library, with different > >> working directories this will not happen. > > > > I use ccache so that those "recompiles" take less than 10 seconds on > > average. I don't think having a separate clone per branch is useful, so > > I just have one clone and about 50 PETSC_ARCHes within it. Reconfigure > > is usually not necessary unless you have to go way back in history. > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jed at jedbrown.org Sat Sep 6 20:23:18 2014 From: jed at jedbrown.org (Jed Brown) Date: Sat, 06 Sep 2014 19:23:18 -0600 Subject: [petsc-users] Petsc configuration and multiple branch working In-Reply-To: <442996BE-60FF-4234-B19B-55219F1E76BC@mcs.anl.gov> References: <87ha0kwehu.fsf@jedbrown.org> <8BAA96E2-06F3-4454-ABA6-E1FFB33B9FAE@mcs.anl.gov> <442996BE-60FF-4234-B19B-55219F1E76BC@mcs.anl.gov> Message-ID: <87zjecusu1.fsf@jedbrown.org> Barry Smith writes: > What will I do with all this new freed up time? Does it work in Xcode? Sounds like that could use up the newfound time and then some. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From john.m.alletto at lmco.com Sun Sep 7 16:35:51 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Sun, 7 Sep 2014 21:35:51 +0000 Subject: [petsc-users] Where can I learn about combining grids for a Laplacian Message-ID: I have a 3D Laplacian program that I wrote in matlab that I would like to transition to PETSc. 
In my current program I have an inner cube which utilizes a 27 point uniform stencil surrounded by an outer cubical which uses a smaller stencil on a variable grid. I require the inner cubical to contain accurate results and do not care if error accumulates in the outer regions. Where can I learn about combining grids for PETSc implementation? Many Thanks John -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Sep 7 20:06:44 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 07 Sep 2014 19:06:44 -0600 Subject: [petsc-users] Where can I learn about combining grids for a Laplacian In-Reply-To: References: Message-ID: <87wq9egbtn.fsf@jedbrown.org> "Alletto, John M" writes: > I have a 3D Laplacian program that I wrote in matlab that I would like to transition to PETSc. > > In my current program I have an inner cube which utilizes a 27 point uniform stencil > surrounded by an outer cubical which uses a smaller stencil on a variable grid. > > I require the inner cubical to contain accurate results and do not care if error accumulates in the outer regions. This is not how elliptic PDEs work. In general, you need accuracy throughout the domain to have an accurate solution in a desired area. Depending on the coefficients and forcing function, you might get away with adaptive resolution, but you can't naively lay down a fine mesh/accurate discretization in the target area and expect to get an accurate solution. Now if you had, for example, isotropic coefficients in most of the domain, but anisotropic tensor-valued coefficients in a region, you could use a stencil that is (necessarily) 27-point in the anisotropic region and reduces to 7-point elsewhere. For that, you could preallocate a matrix with the exact number of nonzeros that will be needed, though I recommend starting with the 27-point case and only optimize later if it will provide a clear benefit. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mailinglists at xgm.de Mon Sep 8 02:19:21 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Mon, 08 Sep 2014 09:19:21 +0200 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <87mwaeys43.fsf@jedbrown.org> References: <87mwaeys43.fsf@jedbrown.org> Message-ID: <36997a2571932e4d9029a5f869d25fd9@xgm.de> Hi there, ... Am 05.09.2014 18:00, schrieb Jed Brown: > Florian Lindner writes: > >> Hello, >> >> This may be rather a C/C++ question, but ... >> >> I encapsulate some petsc functions into c++ classes. Since I don't >> want >> to pull all petsc symbols into the global namespace for anyone using >> my >> classes I try to put petsc into it's own namespace: >> >> Header petsc.h: > > Note that there is already a petsc.h in the PETSc distribution. It is > different from yours. Ok, that may not be causing problems here, but it will certainly cause problems in the future, renamed it to petnum.h / .cpp. Thx! 
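One way to meet that goal without wrapping PETSc in a namespace at all is to keep every PETSc header out of the wrapper's public header and hold only an opaque pointer there. The following is only a sketch under that assumption: the file names follow the petnum.h/.cpp rename above, the Vector class and its size/setValue interface are hypothetical, and it relies on the fact that PETSc declares Vec as a pointer to struct _p_Vec; error checking and PetscInitialize/PetscFinalize are omitted for brevity.

// petnum.h -- public header: no PETSc includes leak to users of the class
#ifndef PETNUM_H
#define PETNUM_H

struct _p_Vec;                 // PETSc's Vec is 'struct _p_Vec*', so the tag can be forward-declared

class Vector {
public:
  explicit Vector(int size);   // hypothetical interface, for illustration only
  ~Vector();
  void setValue(int row, double value);
private:
  _p_Vec *vector;              // same handle a Vec holds, but no PETSc header is needed here
};

#endif

// petnum.cpp -- implementation file: the PETSc headers stay here, in the global namespace
#include "petnum.h"
#include <petscvec.h>          // defines Vec as 'struct _p_Vec*', matching the member above

Vector::Vector(int size) : vector(0)
{
  // assumes the application has already called PetscInitialize()
  VecCreate(PETSC_COMM_WORLD, &vector);
  VecSetSizes(vector, PETSC_DECIDE, size);
  VecSetFromOptions(vector);
}

Vector::~Vector()
{
  VecDestroy(&vector);
}

void Vector::setValue(int row, double value)
{
  VecSetValue(vector, row, value, INSERT_VALUES);
  // assembling after every insert keeps the sketch short; real code would batch the inserts
  VecAssemblyBegin(vector);
  VecAssemblyEnd(vector);
}

With this layout, code that includes petnum.h never sees any PETSc symbols; only petnum.cpp is compiled against the PETSc headers.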
>> namespace petsc { >> #include "petscmat.h" >> } >> >> class Vector { >> petsc::Vec vector; >> } >> >> >> Implementation petsc.cpp: >> >> #include "petsc.h" >> >> namespace petsc { >> #include "petscviewer.h" >> } >> >> using namespace petsc; >> >> >> User: >> >> #include "petsc.h" >> #include // if the user wants he can import parts of >> petsc >> of course >> >> >> But this gives a massive amount of error messsages like: >> >> mpic++ -o petsc.o -c -O0 -g3 -Wall >> -I/home/florian/software/petsc/include >> -I/home/florian/software/petsc/arch-linux2-c-debug/include petsc.cpp >> mpic++ -o prbf.o -c -O0 -g3 -Wall >> -I/home/florian/software/petsc/include >> -I/home/florian/software/petsc/arch-linux2-c-debug/include prbf.cpp >> In file included from >> /home/florian/software/petsc/include/petscksp.h:6:0, >> from prbf.cpp:9: >> /home/florian/software/petsc/include/petscpc.h:9:14: error: >> 'PetscErrorCode' does not name a type >> PETSC_EXTERN PetscErrorCode PCInitializePackage(void); > > What do you expect when you put some things in the namespace and > include > dependencies outside the namespace? Well, I have these symbols in the petsc and in the global namespace. > namespaces don't play well with macros (including header guards). I > don't think what you are attempting is a good use of time, but if you > do > it, I would recommend creating a public interface that does not include > any PETSc headers, thus completely hiding the PETSc interface. I want to offer an interface that uses some Petsc types (like Vec and Mat) but I do not want to import all petsc symbols into the global namespace for anyone using the interface. That does not seam possible though... Thanks, Florian From C.Klaij at marin.nl Mon Sep 8 02:45:29 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 8 Sep 2014 07:45:29 +0000 Subject: [petsc-users] fieldsplit_0_ monitor in combination with selfp In-Reply-To: References: <2bc6df3de1c645e69d98f3673de704b0@MAR190n2.marin.local> <326013535d8a4af4ad43bc7ab4945f92@MAR190n2.marin.local> <1f81bb0885e94ce59a1f4aa683619cbb@MAR190N1.marin.local> <62638b7e069743d09cc9f5f7e0a4ece3@MAR190N1.marin.local>, Message-ID: Matt, Thanks for clarifying this issue (default preonly for A00 seems rather strange to me). Anyway, in summary: 1) Nothing wrong with ex70.c, one just needs to explicitly specifiy "-fieldsplit_0_ksp_type gmres" to avoid the default preonly for A00 in the diagonal. 2) I've verified using petsc-3.5.1 that "out-of-the-box" SIMPLE mpiexec -n 2 ./ex70 -nx 4 -ny 6 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -pc_fieldsplit_schur_precondition selfp -fieldsplit_1_inner_ksp_type preonly -fieldsplit_1_inner_pc_type jacobi -fieldsplit_0_ksp_monitor -fieldsplit_1_ksp_monitor -ksp_monitor -fieldsplit_0_ksp_type gmres > out1 gives almost the same as "user" SIMPLE mpiexec -n 2 ./ex70 -nx 4 -ny 6 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type lower -user_ksp -fieldsplit_0_ksp_monitor -fieldsplit_1_ksp_monitor -ksp_monitor > out2 as it should. See attachments for out1 and out2. 3) "out-of-the-box" SIMPLE is easier because the user doesn't need to make the Schur approximation, thanks for adding this to PETSc! Chris MARIN news: Development of a Scaled-Down Floating Wind Turbine for Offshore Basin Testing This e-mail may be confidential, privileged and/or protected by copyright. 
If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley Sent: Friday, September 05, 2014 5:38 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Fri, Sep 5, 2014 at 7:31 AM, Klaij, Christiaan > wrote: Thanks! I've spotted another difference: you are setting the fieldsplit_0_ksp_type and I'm not, just relying on the default instead. If I add -fieldsplit_0_ksp_type gmres then is also get the correct answer. Probably, you will get my problem if you remove -fieldsplit_velocity. This is not a bug. The default solver for A00 is preonly, unless it is used as the inner solver as well, in which case it defaults to GMRES so as not to give an inexact Schur complement by default. ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type gmres -fieldsplit_pressure_inner_ksp_max_it 1 -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition selfp SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=77 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 
(fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=2601, allocated nonzeros=2601 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Matt mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -pc_fieldsplit_schur_precondition selfp \ -fieldsplit_1_inner_ksp_type preonly \ -fieldsplit_1_inner_pc_type jacobi \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 \ -fieldsplit_0_ksp_type gmres -ksp_view KSP Object: 2 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, 
divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 2 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 2 MPI processes type: gmres MARIN news: Development of a Scaled-Down Floating Wind Turbine for Offshore Basin Testing This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Friday, September 05, 2014 2:10 PM To: Klaij, Christiaan; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Fri, Sep 5, 2014 at 1:34 AM, Klaij, Christiaan > wrote: Matt, I think the problem is somehow related to -pc_fieldsplit_schur_precondition selfp. In the example below your are not using that option. Here is the selfp output. It retains the A00 solver. ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi -pc_fieldsplit_schur_precondition selfp SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI 
processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=2601, allocated nonzeros=2601 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Thanks, Matt Chris [cid:imagea7316f.JPG at f7909bee.44832f74][cid:image4a95d2.JPG at 67db08fd.4e9b7a1c] dr. ir. Christiaan Klaij CFD Researcher Research & Development MARIN 2, Haagsteeg E C.Klaij at marin.nl P.O. 
Box 28 T +31 317 49 39 11 6700 AA Wageningen F +31 317 49 32 45 T +31 317 49 33 44 The Netherlands I www.marin.nl MARIN news: MARIN at SMM, Hamburg, September 9-12 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. ________________________________ From: Matthew Knepley > Sent: Friday, September 05, 2014 12:36 AM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Thu, Sep 4, 2014 at 7:26 AM, Klaij, Christiaan > wrote: Sorry, here's the ksp_view. I'm expecting -fieldsplit_1_inner_ksp_type preonly to set the ksp(A00) in the Schur complement only, but it seems to set it in the inv(A00) of the diagonal as well. I think something is wrong in your example (we strongly advise against using MatNest directly). I cannot reproduce this using SNES ex62: ./config/builder2.py check src/snes/examples/tutorials/ex62.c --testnum=36 --args="-fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi" which translates to ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 -fieldsplit_pressure_inner_ksp_type preonly -fieldsplit_pressure_inner_pc_type jacobi gives Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 1 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=20 total number of function evaluations=2 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 right preconditioning has attached null space using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, 
needed 3.45047 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=962, cols=962 package used to perform factorization: petsc total: nonzeros=68692, allocated nonzeros=68692 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 456 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000 left preconditioning has attached null space using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=145, cols=145 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=145, cols=962 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_pressure_inner_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_pressure_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=962, cols=962 total: nonzeros=19908, allocated nonzeros=19908 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=962, cols=145 total: nonzeros=4466, allocated nonzeros=4466 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 481 nodes, limit used is 5 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=145, cols=145 total: nonzeros=945, allocated nonzeros=945 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1107, cols=1107 total: nonzeros=29785, allocated nonzeros=29785 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 513 nodes, limit used is 5 Matt Chris 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_1_ solve. 
0 KSP Residual norm 7.185799114488e+01 1 KSP Residual norm 3.873274154012e+01 1 KSP Residual norm 1.107969383366e+00 KSP Object: 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_0_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=48, cols=48 package used to perform factorization: petsc total: nonzeros=200, allocated nonzeros=200 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=240 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_1_sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_) 1 MPI processes type: bjacobi block Jacobi: number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: preonly maximum iterations=10000, 
initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_sub_sub_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=24, cols=24 package used to perform factorization: petsc total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=24, cols=24 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 5 nodes, limit used is 5 A10 Mat Object: (a10_) 1 MPI processes type: mpiaij rows=24, cols=48 total: nonzeros=96, allocated nonzeros=96 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines KSP of A00 KSP Object: (fieldsplit_1_inner_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_1_inner_) 1 MPI processes type: jacobi linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: mpiaij rows=48, cols=48 total: nonzeros=200, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines A01 Mat Object: (a01_) 1 MPI processes type: mpiaij rows=48, cols=24 total: nonzeros=96, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Mat Object: 1 MPI processes type: mpiaij rows=24, cols=24 total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=72, cols=72 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="fieldsplit_0_", type=mpiaij, rows=48, cols=48 (0,1) : prefix="a01_", type=mpiaij, rows=48, cols=24 (1,0) : prefix="a10_", type=mpiaij, rows=24, cols=48 (1,1) : prefix="fieldsplit_1_", type=mpiaij, rows=24, cols=24 From: Matthew Knepley > Sent: Thursday, September 04, 2014 2:20 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] fieldsplit_0_ monitor in combination with selfp On Thu, Sep 4, 2014 at 7:06 AM, Klaij, Christiaan > wrote: I'm playing with the selfp option in fieldsplit using snes/examples/tutorials/ex70.c. 
For example: mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -pc_fieldsplit_schur_precondition selfp \ -fieldsplit_1_inner_ksp_type preonly \ -fieldsplit_1_inner_pc_type jacobi \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 gives the following output 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_1_ solve. 0 KSP Residual norm 2.330138480101e+01 1 KSP Residual norm 1.609000846751e+01 1 KSP Residual norm 1.180287268335e+00 To my suprise I don't see anything for the fieldsplit_0_ solve, why? Always run with -ksp_view for any solver question. Thanks, Matt Furthermore, if I understand correctly the above should be exactly equivalent with mpiexec -n 2 ./ex70 -nx 4 -ny 6 \ -ksp_type fgmres \ -pc_type fieldsplit \ -pc_fieldsplit_type schur \ -pc_fieldsplit_schur_fact_type lower \ -user_ksp \ -fieldsplit_0_ksp_monitor -fieldsplit_0_ksp_max_it 1 \ -fieldsplit_1_ksp_monitor -fieldsplit_1_ksp_max_it 1 \ -ksp_monitor -ksp_max_it 1 0 KSP Residual norm 1.229687498638e+00 Residual norms for fieldsplit_0_ solve. 0 KSP Residual norm 5.486639587672e-01 1 KSP Residual norm 6.348354253703e-02 Residual norms for fieldsplit_1_ solve. 0 KSP Residual norm 2.321938107977e+01 1 KSP Residual norm 1.605484031258e+01 1 KSP Residual norm 1.183225251166e+00 because -user_ksp replaces the Schur complement by the simple approximation A11 - A10 inv(diag(A00)) A01. Beside the missing fielsplit_0_ part, the numbers are pretty close but not exactly the same. Any explanation? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image4a95d2.JPG Type: image/jpeg Size: 1622 bytes Desc: image4a95d2.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagea7316f.JPG Type: image/jpeg Size: 1069 bytes Desc: imagea7316f.JPG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: out1 Type: application/octet-stream Size: 12614 bytes Desc: out1 URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: out2 Type: application/octet-stream Size: 12614 bytes Desc: out2 URL: From jed at jedbrown.org Mon Sep 8 09:08:29 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 08 Sep 2014 08:08:29 -0600 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <36997a2571932e4d9029a5f869d25fd9@xgm.de> References: <87mwaeys43.fsf@jedbrown.org> <36997a2571932e4d9029a5f869d25fd9@xgm.de> Message-ID: <87ha0ifbmq.fsf@jedbrown.org> Florian Lindner writes: >> What do you expect when you put some things in the namespace and >> include >> dependencies outside the namespace? > > Well, I have these symbols in the petsc and in the global namespace. No, the headers have guards so that they are only processed once. So the first time you include a file, that is the only place those symbols will be declared. And don't try to hack around the header guard because that is disallowed by the standard (though it might work anyway) and will cause macro collisions. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Sep 8 10:53:27 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 8 Sep 2014 10:53:27 -0500 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <36997a2571932e4d9029a5f869d25fd9@xgm.de> References: <87mwaeys43.fsf@jedbrown.org> <36997a2571932e4d9029a5f869d25fd9@xgm.de> Message-ID: <227B30B0-4D46-4940-8281-4081877F083D@mcs.anl.gov> On Sep 8, 2014, at 2:19 AM, Florian Lindner wrote: > > I want to offer an interface that uses some Petsc types (like Vec and Mat) but I do not want to import all petsc symbols into the global namespace for anyone using the interface. That does not seam possible though... > > Thanks, > > Florian Which symbols do you want to have available and which ones not? Do you not want to use the other symbols but not let the user use them, for example, you want to use KSP but don?t want the user to have access to them? Or something else. Note that you can configure PETSc into several libraries and then use only the ones you want with -with-single-library=no it creates a separate TS, SNES, KSP, DM, Mat,and Vec library. Let us know what you want in more detail and we may have suggestions on how to achieve it. Barry From balay at mcs.anl.gov Mon Sep 8 11:29:16 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 8 Sep 2014 11:29:16 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: On Mon, 1 Sep 2014, Matthew Knepley wrote: > On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik wrote: > > Satish, can we patch our download source? I can do it, but I am not sure of > the process for HDF5. 
Added a repo https://bitbucket.org/petsc/pkg-hdf5 Fixed the relavent files https://bitbucket.org/petsc/pkg-hdf5/commits/45e78ea38975af56f6fe26cfd7068475a2d94f10 Spun a new tarball http://ftp.mcs.anl.gov/pub/petsc/externalpackages/hdf5-1.8.10-patch1.1.tar.gz Updated maint-3.4 and maint (3.5) Satish From jed at jedbrown.org Mon Sep 8 11:53:56 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 08 Sep 2014 10:53:56 -0600 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: <8761gyf3yz.fsf@jedbrown.org> Satish Balay writes: > On Mon, 1 Sep 2014, Matthew Knepley wrote: > >> On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik wrote: >> >> Satish, can we patch our download source? I can do it, but I am not sure of >> the process for HDF5. > > Added a repo https://bitbucket.org/petsc/pkg-hdf5 > > Fixed the relavent files https://bitbucket.org/petsc/pkg-hdf5/commits/45e78ea38975af56f6fe26cfd7068475a2d94f10 > > Spun a new tarball http://ftp.mcs.anl.gov/pub/petsc/externalpackages/hdf5-1.8.10-patch1.1.tar.gz Uh, hdf5-1.8.12 (at least) has fixed these problems. Why not upgrade (preferably to 1.8.13)? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Mon Sep 8 12:15:18 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 8 Sep 2014 12:15:18 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <8761gyf3yz.fsf@jedbrown.org> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <8761gyf3yz.fsf@jedbrown.org> Message-ID: On Mon, 8 Sep 2014, Jed Brown wrote: > Satish Balay writes: > > > On Mon, 1 Sep 2014, Matthew Knepley wrote: > > > >> On Mon, Sep 1, 2014 at 6:57 AM, ?smund Ervik wrote: > >> > >> Satish, can we patch our download source? I can do it, but I am not sure of > >> the process for HDF5. > > > > Added a repo https://bitbucket.org/petsc/pkg-hdf5 > > > > Fixed the relavent files https://bitbucket.org/petsc/pkg-hdf5/commits/45e78ea38975af56f6fe26cfd7068475a2d94f10 > > > > Spun a new tarball http://ftp.mcs.anl.gov/pub/petsc/externalpackages/hdf5-1.8.10-patch1.1.tar.gz > > Uh, hdf5-1.8.12 (at least) has fixed these problems. Why not upgrade > (preferably to 1.8.13)? Something to check for master.. Satish From balay at mcs.anl.gov Mon Sep 8 12:37:39 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 8 Sep 2014 12:37:39 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <54045F23.2070506@ntnu.no> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: On Mon, 1 Sep 2014, ?smund Ervik wrote: > This is the reason why I asked whether I can somehow tell > "--download-hdf5" to download a more recent version. 1.8.11 should do it. This works for me with examples.. --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz Satish ----------- balay at asterix /home/balay/petsc/src/vec/vec/examples/tutorials (master) $ ./ex10 -hdf5 Vec Object:Test_Vec 1 MPI processes type: seq 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 writing vector in hdf5 to vector.dat ... reading vector in hdf5 from vector.dat ... 
Vec Object:Test_Vec 1 MPI processes type: seq 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 balay at asterix /home/balay/petsc/src/vec/vec/examples/tutorials (master) $ From jed at jedbrown.org Mon Sep 8 12:52:23 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 08 Sep 2014 11:52:23 -0600 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> Message-ID: <87r3zmdmp4.fsf@jedbrown.org> Satish Balay writes: > On Mon, 1 Sep 2014, ?smund Ervik wrote: > >> This is the reason why I asked whether I can somehow tell >> "--download-hdf5" to download a more recent version. 1.8.11 should do it. > > > This works for me with examples.. > > --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz Sounds good, will you upgrade hdf5.py? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Mon Sep 8 13:51:34 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 8 Sep 2014 13:51:34 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: <87r3zmdmp4.fsf@jedbrown.org> References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org> Message-ID: On Mon, 8 Sep 2014, Jed Brown wrote: > Satish Balay writes: > > > On Mon, 1 Sep 2014, ?smund Ervik wrote: > > > >> This is the reason why I asked whether I can somehow tell > >> "--download-hdf5" to download a more recent version. 1.8.11 should do it. > > > > > > This works for me with examples.. > > > > --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz > > Sounds good, will you upgrade hdf5.py? pushed and merged to next https://bitbucket.org/petsc/petsc/commits/72dba1de4d91de2998c7fb909813c856df9b7209 satish From C.Klaij at marin.nl Tue Sep 9 02:31:23 2014 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 9 Sep 2014 07:31:23 +0000 Subject: [petsc-users] how to change KSP of A00 inside the Schur complement? Message-ID: <1b5f0863dd9347969309b256ad97932c@MAR190N1.marin.local> In the program below, I'm using PCFieldSplitGetSubKSP to get the sub KSP's of a Schur fieldsplit preconditioner. I'm setting fieldsplit_0 to BICG+ILU and fieldsplit_1 to CG+ICC. Running $ ./fieldsplittry -ksp_view shows that this works as expected (full output below). Now, I would like to change the KSP of A00 inside the Schur complement, so I'm running $ ./fieldsplittry -fieldsplit_1_inner_ksp_type preonly -fieldsplit_1_inner_pc_type jacobi -ksp_view (full output below). To my surprise, this shows the fieldsplit_1_inner_ KSP to be BICG+ILU while the fieldsplit_0_ KSP is changed to preonly; the fieldsplit_0_ PC is still ILU (no Jacobi anywhere). What am I doing wrong this time? The bottom line is: how do I set the KSP of A00 inside the Schur complement to preonly+jacobi while keeping the other settings? Preferably directly in the code without command-line arguments. 
Chris $ cat fieldsplittry.F90 program fieldsplittry use petscksp implicit none #include PetscErrorCode :: ierr PetscInt :: size,i,j,start,end,n=4,numsplit=1 PetscScalar :: zero=0.0,one=1.0 Vec :: diag3,x,b Mat :: A,subA(4),myS PC :: pc,subpc(2) KSP :: ksp,subksp(2) IS :: isg(2) call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr); ! vectors call VecCreateMPI(MPI_COMM_WORLD,3*n,PETSC_DECIDE,diag3,ierr); CHKERRQ(ierr) call VecSet(diag3,one,ierr); CHKERRQ(ierr) call VecCreateMPI(MPI_COMM_WORLD,4*n,PETSC_DECIDE,x,ierr); CHKERRQ(ierr) call VecSet(x,zero,ierr); CHKERRQ(ierr) call VecDuplicate(x,b,ierr); CHKERRQ(ierr) call VecSet(b,one,ierr); CHKERRQ(ierr) ! matrix a00 call MatCreateAIJ(MPI_COMM_WORLD,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,1,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,subA(1),ierr);CHKERRQ(ierr) call MatDiagonalSet(subA(1),diag3,INSERT_VALUES,ierr);CHKERRQ(ierr) call MatAssemblyBegin(subA(1),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) call MatAssemblyEnd(subA(1),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) ! matrix a01 call MatCreateAIJ(MPI_COMM_WORLD,3*n,n,PETSC_DECIDE,PETSC_DECIDE,1,PETSC_NULL_INTEGER,1,PETSC_NULL_INTEGER,subA(2),ierr);CHKERRQ(ierr) call MatGetOwnershipRange(subA(2),start,end,ierr);CHKERRQ(ierr); do i=start,end-1 j=mod(i,size*n) call MatSetValue(subA(2),i,j,one,INSERT_VALUES,ierr);CHKERRQ(ierr) end do call MatAssemblyBegin(subA(2),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) call MatAssemblyEnd(subA(2),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) ! matrix a10 call MatTranspose(subA(2),MAT_INITIAL_MATRIX,subA(3),ierr);CHKERRQ(ierr) ! matrix a11 (empty) call MatCreateAIJ(MPI_COMM_WORLD,n,n,PETSC_DECIDE,PETSC_DECIDE,0,PETSC_NULL_INTEGER,0,PETSC_NULL_INTEGER,subA(4),ierr);CHKERRQ(ierr) call MatAssemblyBegin(subA(4),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) call MatAssemblyEnd(subA(4),MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) ! nested mat [a00,a01;a10,a11] call MatCreateNest(MPI_COMM_WORLD,2,PETSC_NULL_OBJECT,2,PETSC_NULL_OBJECT,subA,A,ierr);CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRQ(ierr) call MatNestGetISs(A,isg,PETSC_NULL_OBJECT,ierr);CHKERRQ(ierr); ! KSP and PC call KSPCreate(MPI_COMM_WORLD,ksp,ierr);CHKERRQ(ierr) call KSPSetOperators(ksp,A,A,ierr);CHKERRQ(ierr) call KSPSetType(ksp,KSPGCR,ierr);CHKERRQ(ierr) call KSPGetPC(ksp,pc,ierr);CHKERRQ(ierr) call PCSetType(pc,PCFIELDSPLIT,ierr);CHKERRQ(ierr) call PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR,ierr);CHKERRQ(ierr) call PCFieldSplitSetIS(pc,"0",isg(1),ierr);CHKERRQ(ierr) call PCFieldSplitSetIS(pc,"1",isg(2),ierr);CHKERRQ(ierr) call PCFieldSplitSetSchurFactType(pc,PC_FIELDSPLIT_SCHUR_FACT_LOWER,ierr);CHKERRQ(ierr) call PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,PETSC_NULL_OBJECT,ierr);CHKERRQ(ierr) call KSPSetUp(ksp,ierr);CHKERRQ(ierr); call PCFieldSplitGetSubKSP(pc,numsplit,subksp,ierr);CHKERRQ(ierr) call KSPSetType(subksp(1),KSPBICG,ierr);CHKERRQ(ierr) call KSPGetPC(subksp(1),subpc(1),ierr);CHKERRQ(ierr) call PCSetType(subpc(1),PCILU,ierr);CHKERRQ(ierr) call KSPSetType(subksp(2),KSPCG,ierr);CHKERRQ(ierr) call KSPGetPC(subksp(2),subpc(2),ierr);CHKERRQ(ierr) call PCSetType(subpc(2),PCICC,ierr);CHKERRQ(ierr) ! call PetscFree(subksp);CHKERRQ(ierr); call KSPSetFromOptions(ksp,ierr);CHKERRQ(ierr) call KSPSolve(ksp,b,x,ierr);CHKERRQ(ierr) call KSPGetSolution(ksp,x,ierr);CHKERRQ(ierr) ! 
call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program fieldsplittry $ ./fieldsplittry -ksp_view KSP Object: 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: bicg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12 package used to perform factorization: petsc total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=12, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: cg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=4, cols=4 package used to perform factorization: petsc total: nonzeros=4, allocated nonzeros=4 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=4, cols=4 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=4, cols=4 total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 1 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=4, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: bicg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio 
given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12 package used to perform factorization: petsc total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=12, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: 1 MPI processes type: seqaij rows=12, cols=4 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: 1 MPI processes type: seqaij rows=4, cols=4 total: nonzeros=4, allocated nonzeros=4 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=16, cols=16 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="fieldsplit_0_", type=seqaij, rows=12, cols=12 (0,1) : type=seqaij, rows=12, cols=4 (1,0) : type=seqaij, rows=4, cols=12 (1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4, cols=4 $ ./fieldsplittry -fieldsplit_1_inner_ksp_type preonly -fieldsplit_1_inner_pc_type jacobi -ksp_view KSP Object: 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization LOWER Preconditioner for the Schur complement formed from Sp, an assembled approximation to S, which uses (the lumped) A00's diagonal's inverse Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12 package used to perform factorization: petsc total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=12, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: cg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=4, cols=4 package used to perform factorization: petsc total: nonzeros=4, 
allocated nonzeros=4 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=4, cols=4 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=4, cols=4 total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 1 nodes, limit used is 5 A10 Mat Object: 1 MPI processes type: seqaij rows=4, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_1_inner_) 1 MPI processes type: bicg maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_inner_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=12, cols=12 package used to perform factorization: petsc total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=12, cols=12 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: 1 MPI processes type: seqaij rows=12, cols=4 total: nonzeros=12, allocated nonzeros=12 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: 1 MPI processes type: seqaij rows=4, cols=4 total: nonzeros=4, allocated nonzeros=4 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=16, cols=16 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="fieldsplit_0_", type=seqaij, rows=12, cols=12 (0,1) : type=seqaij, rows=12, cols=4 (1,0) : type=seqaij, rows=4, cols=12 (1,1) : prefix="fieldsplit_1_", type=seqaij, rows=4, cols=4 $ dr. ir. Christiaan Klaij CFD Researcher Research & Development E mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I www.marin.nl From mailinglists at xgm.de Tue Sep 9 06:12:07 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 09 Sep 2014 13:12:07 +0200 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <227B30B0-4D46-4940-8281-4081877F083D@mcs.anl.gov> References: <36997a2571932e4d9029a5f869d25fd9@xgm.de> <227B30B0-4D46-4940-8281-4081877F083D@mcs.anl.gov> Message-ID: <2589998.d4cIGtPG42@asaru> Am Montag, 8. September 2014, 10:53:27 schrieb Barry Smith: > > On Sep 8, 2014, at 2:19 AM, Florian Lindner wrote: > > > > > I want to offer an interface that uses some Petsc types (like Vec and Mat) but I do not want to import all petsc symbols into the global namespace for anyone using the interface. That does not seam possible though... > > > > Thanks, > > > > Florian > > Which symbols do you want to have available and which ones not? 
Do you not want to use the other symbols but not let the user use them, for example, you want to use KSP but don?t want the user to have access to them? Or something else. Note that you can configure PETSc into several libraries and then use only the ones you want with -with-single-library=no it creates a separate TS, SNES, KSP, DM, Mat,and Vec library. > > Let us know what you want in more detail and we may have suggestions on how to achieve it. Thanks for your help! If a user of my class library does an #include "mypetsc.h" he should not get all the petsc symbols in his global namespace, e.g. #include "mypetsc.h" Vector v; // is an object of my class lib Vec pv; // should not work, since I do not want petsc symbols in my global namespace petsc::Vec pv; // fine, since petsc symbols are contained in namespace petsc. But this does not seem to be possible. petsc::Vec = v.vector // vector of type petsc::Vec is a public member of Vector. That's why "mypetsc.h" needs to #include "petscvec.h". Because of that an #include "mypetsc" imports all the symbols of "petscvec" into anyone wanting to use Vector. Since "mypetsc.h" only contains declarations and no code I tried it with forward declarations of Vec, but no success. At the end an #include "mypetsc.h" should import only my own symbols like Vector and Matrix. It would be ok, if it also imports Vec and Mat (which are types of public members / return types of public functions). It would also be ok, if it imports all other petsc symbols (like VecGetOwnershipRange) in a seperate namespace. I hope I was able to convey what I want... Thx, Florian From ashwinsrnth at gmail.com Tue Sep 9 10:15:02 2014 From: ashwinsrnth at gmail.com (Ashwin Srinath) Date: Tue, 9 Sep 2014 11:15:02 -0400 Subject: [petsc-users] [petsc4py] Interoperability with PyCUDA? Message-ID: Hello, petsc-users I posted about this before without luck, but perhaps that was a little too specific a request: http://lists.mcs.anl.gov/pipermail/petsc-users/2014-July/022145.html In general, it would be great to be able to use PyCUDA and petsc4py together. Here is an example of how I'd like to be able to do this (based on this PyCUDA example ): from petsc4py import PETSc as petsc import pycuda.driver as cuda from pycuda import autoinit from pycuda.compiler import SourceModule mod = SourceModule(""" __global__ void doublify(double *a) { int idx = threadIdx.x + threadIdx.y*4; a[idx] *= 2; } """) v = petsc.Vec() v.create() v.setSizes(16) v.setType('cusp') v.set(1) func = mod.get_function("doublify") *func(???, block=(4,4,1))* `func` accepts as ??? an object that supports the Python buffer interface, see here , and I'm wondering if it's possible for petsc4py cusp vectors to support that? Are there any other ways to use custom kernels with petsc4py cusp vectors? petsc4py users - if you use CUSP vectors in your petsc4py code, may I ask how? Thanks so much, Ashwin -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Sep 9 10:38:25 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 9 Sep 2014 10:38:25 -0500 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <2589998.d4cIGtPG42@asaru> References: <36997a2571932e4d9029a5f869d25fd9@xgm.de> <227B30B0-4D46-4940-8281-4081877F083D@mcs.anl.gov> <2589998.d4cIGtPG42@asaru> Message-ID: On Tue, 9 Sep 2014, Florian Lindner wrote: > Am Montag, 8. 
September 2014, 10:53:27 schrieb Barry Smith: > > > > On Sep 8, 2014, at 2:19 AM, Florian Lindner wrote: > > > > > > > > I want to offer an interface that uses some Petsc types (like Vec and Mat) but I do not want to import all petsc symbols into the global namespace for anyone using the interface. That does not seam possible though... > > > > > > Thanks, > > > > > > Florian > > > > Which symbols do you want to have available and which ones not? Do you not want to use the other symbols but not let the user use them, for example, you want to use KSP but don?t want the user to have access to them? Or something else. Note that you can configure PETSc into several libraries and then use only the ones you want with -with-single-library=no it creates a separate TS, SNES, KSP, DM, Mat,and Vec library. > > > > Let us know what you want in more detail and we may have suggestions on how to achieve it. > > Thanks for your help! > > If a user of my class library does an #include "mypetsc.h" he should not get all the petsc symbols in his global namespace, e.g. > > #include "mypetsc.h" > > Vector v; // is an object of my class lib > Vec pv; // should not work, since I do not want petsc symbols in my global namespace > petsc::Vec pv; // fine, since petsc symbols are contained in namespace petsc. But this does not seem to be possible. > > petsc::Vec = v.vector // vector of type petsc::Vec is a public member of Vector. That's why "mypetsc.h" needs to #include "petscvec.h". Because of that an #include "mypetsc" imports all the symbols of "petscvec" into anyone wanting to use Vector. > > Since "mypetsc.h" only contains declarations and no code I tried it with forward declarations of Vec, but no success. > Hm - it should work. PETSc uses opaque objects (pointers) in C. You should be able to do the same from C++ http://en.wikipedia.org/wiki/Opaque_pointer >>>>>>>> from include/petscmat.h typedef struct _p_Mat* Mat; from include/petsc-private/matimpl.h struct _p_Mat { PETSCHEADER(struct _MatOps); }; <<<<<<< You should be able to use similar mode. Attaching a [non-working] example - that compiles without errors. Note: if you are exposing more than PETSc objects [like enums, or other things - you might need to duplicate them in your interface.. 
[i.e I don't think you can wrap petsc headers with 'namespace' and have all PETSc public stuff available automatically] Satish -------------- $ make mpicxx -o ex1.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/balay/petsc/include -I/home/balay/petsc/arch-maint/include -I/usr/include/mpich-x86_64 `pwd`/ex1.cpp mpicxx -o mypetsc.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/balay/petsc/include -I/home/balay/petsc/arch-maint/include -I/usr/include/mpich-x86_64 `pwd`/mypetsc.cpp mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0 -o ex1 ex1.o mypetsc.o -Wl,-rpath,/home/balay/petsc/arch-maint/lib -L/home/balay/petsc/arch-maint/lib -lpetsc -llapack -lblas -lX11 -lpthread -lm -Wl,-rpath,/usr/lib64/mpich/lib -L/usr/lib64/mpich/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.9.1 -L/usr/lib/gcc/x86_64-redhat-linux/4.9.1 -lmpichf90 -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -Wl,-rpath,/usr/lib64/mpich/lib -L/usr/lib64/mpich/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.9.1 -L/usr/lib/gcc/x86_64-redhat-linux/4.9.1 -ldl -Wl,-rpath,/usr/lib64/mpich/lib -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl /usr/bin/rm -f ex1.o mypetsc.o balay at asterix /home/balay/tmp/test $ cat mypetsc.h namespace petsc { typedef struct _p_Vec* Vec; typedef struct _p_Mat* Mat; } class Vector { public: int Scale(); private: petsc::Vec vector; }; balay at asterix /home/balay/tmp/test $ cat mypetsc.cpp #include "petsc.h" #include "petsc-private/vecimpl.h" #include "petsc-private/matimpl.h" #include "mypetsc.h" int Vector::Scale() { PetscScalar a=1.0; VecScale(Vector::vector,a); return 0; } balay at asterix /home/balay/tmp/test $ cat ex1.cpp #include "mypetsc.h" int main () { Vector v; v.Scale(); return 0; } balay at asterix /home/balay/tmp/test $ ------------------------------------ > At the end an #include "mypetsc.h" should import only my own symbols like Vector and Matrix. It would be ok, if it also imports Vec and Mat (which are types of public members / return types of public functions). It would also be ok, if it imports all other petsc symbols (like VecGetOwnershipRange) in a seperate namespace. > > I hope I was able to convey what I want... > > Thx, > Florian > -------------- next part -------------- A non-text attachment was scrubbed... Name: test.tar.gz Type: application/gzip Size: 599 bytes Desc: URL: From jed at jedbrown.org Tue Sep 9 10:41:20 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 09 Sep 2014 09:41:20 -0600 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <2589998.d4cIGtPG42@asaru> References: <36997a2571932e4d9029a5f869d25fd9@xgm.de> <227B30B0-4D46-4940-8281-4081877F083D@mcs.anl.gov> <2589998.d4cIGtPG42@asaru> Message-ID: <8761gwby3j.fsf@jedbrown.org> Florian Lindner writes: > If a user of my class library does an #include "mypetsc.h" he should not get all the petsc symbols in his global namespace, e.g. > > #include "mypetsc.h" > > Vector v; // is an object of my class lib > Vec pv; // should not work, since I do not want petsc symbols in my global namespace > petsc::Vec pv; // fine, since petsc symbols are contained in namespace petsc. But this does not seem to be possible. > > petsc::Vec = v.vector // vector of type petsc::Vec is a public member of Vector. That's why "mypetsc.h" needs to #include "petscvec.h". Because of that an #include "mypetsc" imports all the symbols of "petscvec" into anyone wanting to use Vector. 
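As an aside to the exchange above, a minimal sketch of the simpler variant Florian says would also be acceptable (exposing only the opaque Vec typedef, nothing else): the public header repeats PETSc's own opaque-pointer typedef, and only the implementation file includes the real headers. This assumes petscvec.h declares Vec the same way the petscmat.h excerpt above declares Mat, i.e. "typedef struct _p_Vec* Vec;", so the duplicate typedef is identical and therefore legal in C++. It is a sketch, not the attached example:

/* mypetsc.h -- public header: no PETSc #includes, so no PETSc function
   symbols reach the user's translation unit.  Only the opaque handle type
   is repeated, matching PETSc's own declaration (assumption noted above). */
typedef struct _p_Vec *Vec;

class Vector {
public:
  Vec vector;           /* public PETSc handle, as Florian allows */
  int scale(double a);  /* implemented in mypetsc.cpp */
};

/* mypetsc.cpp -- the only file that sees the full PETSc API */
#include <petscvec.h>
#include "mypetsc.h"

int Vector::scale(double a)
{
  return VecScale(vector, a);   /* assumes a real-valued PetscScalar build */
}

A user who includes only mypetsc.h sees Vector, Vec and nothing else; VecScale, VecGetOwnershipRange and the rest stay invisible until a PETSc header is included explicitly.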
In your first email, you wrote: #include "petsc.h" #include // if the user wants he can import parts of petsc This is trying to include the same file twice, which you cannot do. You can wrap a namespace around a PETSc header, but it will break if the user includes a PETSc file directly either before or after, and macros like ADD_VALUES, NORM_2, PETSC_VIEWER_STDOUT_WORLD, etc., will not have the petsc:: namespace. I think that your wrapper class with public members provides no value and will be more difficult to use. At least this has been the case in every instance I have seen, and almost everyone that tries later concludes that it was a bad idea and abandons it. What value do you think it provides? It's not encapsulation because the implementation is public and it doesn't break any dependencies because the header is still included. So it's mostly an extra layer of (logical) indirection providing only a minor syntax change that will not be familiar to other users of PETSc. What value is that? Now if you mean to request that all PETSc symbols be unified in the (C-style) Petsc* namespace instead of occupying a handful of others (Vec*, Mat*, KSP*, etc.), we know that it's the right thing to do and have been holding off mainly to limit the amount of changes that we force users to deal with in order to upgrade. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From balay at mcs.anl.gov Tue Sep 9 10:56:05 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 9 Sep 2014 10:56:05 -0500 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org> Message-ID: On Mon, 8 Sep 2014, Satish Balay wrote: > On Mon, 8 Sep 2014, Jed Brown wrote: > > > Satish Balay writes: > > > > > On Mon, 1 Sep 2014, ?smund Ervik wrote: > > > > > >> This is the reason why I asked whether I can somehow tell > > >> "--download-hdf5" to download a more recent version. 1.8.11 should do it. > > > > > > > > > This works for me with examples.. > > > > > > --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz > > > > Sounds good, will you upgrade hdf5.py? > > pushed and merged to next > > https://bitbucket.org/petsc/petsc/commits/72dba1de4d91de2998c7fb909813c856df9b7209 http://ftp.mcs.anl.gov/pub/petsc/nightlylogs/archive/2014/09/09/configure_next_arch-c-exodus-dbg-builder_bb-proxy.log > ../liblib/.libs/libnetcdf.so: undefined reference to `H5Pset_fapl_mpiposix' Ok there is an issue with hdf5-1.8.13 with current netcdf. Looks like its fixed in netcdf.git http://hdf-forum.184993.n3.nabble.com/Re-undefined-reference-to-H5Pset-fapl-mpiposix-td4027216.html Revert or use netcdf.git? [there can be a cascading effect of dependencies?] Satish From jed at jedbrown.org Tue Sep 9 11:07:19 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 09 Sep 2014 10:07:19 -0600 Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes In-Reply-To: References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org> Message-ID: <871trkbww8.fsf@jedbrown.org> Satish Balay writes: > Ok there is an issue with hdf5-1.8.13 with current netcdf. 

Gah, after all their effort to maintain binary compatibility from 1.6 to 1.8, they casually break compatibility in a subminor release.

> Looks like its fixed in netcdf.git
>
> http://hdf-forum.184993.n3.nabble.com/Re-undefined-reference-to-H5Pset-fapl-mpiposix-td4027216.html
>
> Revert or use netcdf.git? [there can be a cascading effect of dependencies?]

It may impact exodusii, so maybe just use hdf5-1.8.12.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL:

From balay at mcs.anl.gov Tue Sep 9 11:09:12 2014
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 9 Sep 2014 11:09:12 -0500
Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes
In-Reply-To:
References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org>
Message-ID:

On Tue, 9 Sep 2014, Satish Balay wrote:
> > > > This works for me with examples..
> > > >
> > > > --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz
> > >
> > > Sounds good, will you upgrade hdf5.py?
> >
> > pushed and merged to next
> >
> > https://bitbucket.org/petsc/petsc/commits/72dba1de4d91de2998c7fb909813c856df9b7209
>
> http://ftp.mcs.anl.gov/pub/petsc/nightlylogs/archive/2014/09/09/configure_next_arch-c-exodus-dbg-builder_bb-proxy.log
>
> > ../liblib/.libs/libnetcdf.so: undefined reference to `H5Pset_fapl_mpiposix'
>
> Ok there is an issue with hdf5-1.8.13 with current netcdf. Looks like its fixed in netcdf.git
>
> http://hdf-forum.184993.n3.nabble.com/Re-undefined-reference-to-H5Pset-fapl-mpiposix-td4027216.html
>
> Revert or use netcdf.git? [there can be a cascading effect of dependencies?]

Another hdf5 failure in the nightly builds..

http://ftp.mcs.anl.gov/pub/petsc/nightlylogs/archive/2014/09/09/build_next_arch-freebsd-cxx-cmplx-pkgs-dbg_wii.log

>>>>>>>>
Warning: Possible change of value in conversion from INTEGER(8) to INTEGER(4) at (1)
FC fortranlib_test-tH5R.o
FC fortranlib_test-tH5S.o
FC fortranlib_test-tH5T.o
FC fortranlib_test-tH5VL.o
FC fortranlib_test-tH5Z.o
FC fortranlib_test-tH5Sselect.o
FC fortranlib_test-tH5P.o
FC fortranlib_test-tH5A.o
FC fortranlib_test-tH5I.o
FC fortranlib_test-tH5G.o
FC fortranlib_test-tH5E.o
FC fortranlib_test-tHDF5.o
tHDF5.f90:34.10:
USE TH5E
1
Fatal Error: Can't open module file 'th5e.mod' for reading at (1): No such file or directory
<<<<<<<<<<<

Reinvoking 'make' in the build dir gets the build going. [and it's not even a parallel build on this machine]. Weird..

Satish

From jed at jedbrown.org Tue Sep 9 11:14:17 2014
From: jed at jedbrown.org (Jed Brown)
Date: Tue, 09 Sep 2014 10:14:17 -0600
Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes
In-Reply-To:
References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org>
Message-ID: <87tx4gai06.fsf@jedbrown.org>

Satish Balay writes:
> Reinvoking 'make' in the build dir gets the build going. [and it's not even a parallel build on this
> machine]. Weird..

Different source files can have ordering requirements and there is no good way to determine that ordering, so it's often semi-manual or based on crude heuristics. Thanks, Fortran Committee.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL:

From jed at jedbrown.org Tue Sep 9 11:21:35 2014
From: jed at jedbrown.org (Jed Brown)
Date: Tue, 09 Sep 2014 10:21:35 -0600
Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes
In-Reply-To: <871trkbww8.fsf@jedbrown.org>
References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org> <871trkbww8.fsf@jedbrown.org>
Message-ID: <87r3zkaho0.fsf@jedbrown.org>

Jed Brown writes:
> Satish Balay writes:
>> Ok there is an issue with hdf5-1.8.13 with current netcdf.
>
> Gah, after all their effort to maintain binary compatibility from 1.6 to
> 1.8, they casually break compatibility in a subminor release.

And they didn't even bump the soname (it's libhdf5.so.8 in both cases). Time to subscribe to their list to complain.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL:

From balay at mcs.anl.gov Tue Sep 9 13:07:16 2014
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 9 Sep 2014 13:07:16 -0500
Subject: [petsc-users] petsc-3.5.2.tar.gz now available
Message-ID:

Dear PETSc users,

The patch release petsc-3.5.2 is now available for download.
http://www.mcs.anl.gov/petsc/download/index.html

Some of the changes include:

* ml: fix partitioning with parmetis
* build: fix gnumake dependencies and verify minimum required gnumake version.
* fix: ISLocalToGlobalCreateIS() for bs > 0
* fix: ISLocalToGlobalMappingCreateIS() to respect the block size of the IS
* KSPSetSupportedNorm: simplify priority logic around KSP_NORM_NONE
* AIJ/BAIJ: Performance improvement for MatZeroRows(), MatZeroRowsColumns()
* PetscFV: Fully enable other limiters
* TS+Plex: Overlap cells were misidentified as ghost cells
* fix: MatAXPY_ for mpibaij, mpisbaij, seqbaij and seqsbaij
* pastix: updated to version 5.2.2 This prevents crashes with diagonal matrices
* DMDA: Fix VecView() for two-dimensional grids with curvilinear coordinates
* DMDA: Fix VecView() displaying of coordinate limits in parallel
* configure: use LIBS option verbatim in the link line.
* configure: check if the compiler returns zero error code on link failure

Satish

From balay at mcs.anl.gov Tue Sep 9 17:43:14 2014
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 9 Sep 2014 17:43:14 -0500
Subject: [petsc-users] Configure/make with "--download-hdf5" or "--with-hdf5-dir=" crashes
In-Reply-To: <871trkbww8.fsf@jedbrown.org>
References: <54042CDA.9080002@ntnu.no> <540455F5.8050908@ntnu.no> <54045F23.2070506@ntnu.no> <87r3zmdmp4.fsf@jedbrown.org> <871trkbww8.fsf@jedbrown.org>
Message-ID:

On Tue, 9 Sep 2014, Jed Brown wrote:
> Satish Balay writes:
> > Ok there is an issue with hdf5-1.8.13 with current netcdf.
>
> Gah, after all their effort to maintain binary compatibility from 1.6 to
> 1.8, they casually break compatibility in a subminor release.
>
> > Looks like its fixed in netcdf.git
> >
> > http://hdf-forum.184993.n3.nabble.com/Re-undefined-reference-to-H5Pset-fapl-mpiposix-td4027216.html
> >
> > Revert or use netcdf.git? [there can be a cascading effect of dependencies?]
>
> It may impact exodusii, so maybe just use hdf5-1.8.12.

Ok - switched to hdf5-1.8.12 - merged with next.

Satish
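For completeness, the workaround that emerged in this thread is to point configure at an explicit HDF5 tarball instead of the 1.8.10 that triggers the h5tools_str.c error, e.g. (a sketch of the invocation; the URL is the one quoted earlier):

./configure --download-hdf5=http://www.hdfgroup.org/ftp/HDF5/prev-releases/hdf5-1.8.13/src/hdf5-1.8.13.tar.gz

with the caveat, per the nightly-build failures above, that 1.8.13 clashes with the netcdf package configure builds alongside it, which is why the --download-hdf5 default was moved to 1.8.12 instead.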

From blechta at karlin.mff.cuni.cz Wed Sep 10 08:34:34 2014
From: blechta at karlin.mff.cuni.cz (Jan Blechta)
Date: Wed, 10 Sep 2014 15:34:34 +0200
Subject: [petsc-users] MAT_IGNORE_NEGATIVE_INDICES
Message-ID: <20140910153434.0d08f5a9@gott>

Is there an analogue of VEC_IGNORE_NEGATIVE_INDICES for Mats? Can such a behaviour somehow be emulated?

Jan

From blechta at karlin.mff.cuni.cz Wed Sep 10 08:47:05 2014
From: blechta at karlin.mff.cuni.cz (Jan Blechta)
Date: Wed, 10 Sep 2014 15:47:05 +0200
Subject: [petsc-users] MAT_IGNORE_NEGATIVE_INDICES
In-Reply-To: <20140910153434.0d08f5a9@gott>
References: <20140910153434.0d08f5a9@gott>
Message-ID: <20140910154705.7f92ab2c@gott>

Just found the answer in the 'MatSetValues' doc:

> Negative indices may be passed in idxm and idxn, these rows and
> columns are simply ignored.

Jan

On Wed, 10 Sep 2014 15:34:34 +0200 Jan Blechta wrote:
> Is there an analogue of VEC_IGNORE_NEGATIVE_INDICES for Mats? Can
> such a behaviour somehow be emulated?
>
> Jan

From Eric.Chamberland at giref.ulaval.ca Thu Sep 11 10:34:17 2014
From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland)
Date: Thu, 11 Sep 2014 11:34:17 -0400
Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit
In-Reply-To: <54042CDA.9080002@ntnu.no>
References: <54042CDA.9080002@ntnu.no>
Message-ID: <5411C0F9.6090106@giref.ulaval.ca>

Hi,

I was just curious to know why the prefix of the sub-matrix a_{11} in a matnest is forced to the KSP prefix by PCSetUp_FieldSplit. In my case it changes from "gcrSchur_fieldsplit_a_11_" to "gcrSchur_fieldsplit_schur_" after PCSetUp_FieldSplit, by these (I think) lines of code in fieldsplit.c (petsc 3.5.2):

const char *prefix;
ierr = MatGetSubMatrix(pc->pmat,ilink->is,ilink->is_col,MAT_INITIAL_MATRIX,&jac->pmat[i]);CHKERRQ(ierr);
ierr = KSPGetOptionsPrefix(ilink->ksp,&prefix);CHKERRQ(ierr);
ierr = MatSetOptionsPrefix(jac->pmat[i],prefix);CHKERRQ(ierr);
ierr = MatViewFromOptions(jac->pmat[i],NULL,"-mat_view");CHKERRQ(ierr);

We wanted to pass options to the a_{11} matrix, but since we use PC_COMPOSITE_SCHUR we have to give the a_{11} matrix a unique prefix, different from the KSP prefix used to solve the Schur complement (which we named "schur"), and have MatSetFromOptions use this unique prefix. It all worked, but we saw this "curiosity" in the KSPView output (see attached log): PCSetUp_FieldSplit renames our matrix, after all the precautions we took to name it differently! The code is working, so maybe this is not an issue, but I can't tell whether it could be harmful.

thank you!
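Purely as an illustration of what an options prefix controls (a sketch, not taken from Eric's code): whatever prefix a Mat carries when MatSetFromOptions() or MatViewFromOptions() runs decides which command-line options it answers to, so a matrix with prefix "schur_" is driven by -schur_mat_type, -schur_mat_view and so on.

/* prefix_demo.cpp -- toy example of MatSetOptionsPrefix + MatSetFromOptions
   (hypothetical file name; error checking omitted for brevity). */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat A;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 4, 4);
  MatSetOptionsPrefix(A, "schur_");   /* from here on, A reads -schur_* options */
  MatSetFromOptions(A);               /* e.g. -schur_mat_type sbaij */
  MatSetUp(A);
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

Running it with -schur_mat_type sbaij (and nothing under the bare -mat_type) shows the prefix being honoured; later option queries, like the -mat_view call in the fieldsplit.c excerpt above, use whatever prefix the matrix holds at that moment.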
Eric -------------- next part -------------- [0] PetscInitialize(): PETSc successfully started: number of processors = 1 [0] PetscGetHostName(): Rejecting domainname, likely is NIS melkor.(none) [0] PetscInitialize(): Running on machine: melkor [0] PetscCommDuplicate(): Duplicating a communicator 139814426697728 68462368 max tags = 2147483647 #PETSc Option Table entries: -info -on_error_attach_debugger ddd #End of PETSc Option Table entries assignation du prefixe (asgnPrefixeOptionsPETSc) pour Solveur_ProjectionL2_0x7fff820fcf20 prefixe : Options_ProjectionL2 type matrice : aij type precond : hypre type solveur : cg (iteratif/precond) librairie : petsc assignation du prefixe (asgnPrefixeOptionsPETSc) pour Solveur_ProjectionL2_0x7fff820ffa28 prefixe : Options_ProjectionL2 type matrice : aij type precond : hypre type solveur : cg (iteratif/precond) librairie : petsc ATTENTION: On a pas la BCS, on bascule vers MUMPS assignation du prefixe (asgnPrefixeOptionsPETSc) pour mon_solvlin prefixe : gcrSchur_ type matrice : nest type precond : fieldsplit type solveur : gcr (iteratif/precond) librairie : mumps chrono::SolveurLinPETSc::detruitKSP::debut VmSize: 556768 VmRSS: 109304 VmPeak: 647920 VmData: 24060 VmHWM: 192648 ::fin VmSize: 556768 VmRSS: 109304 VmPeak: 647920 VmData: 24060 VmHWM: 192648 WC: 0.000304 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 556768 VmRSS: 109304 VmPeak: 647920 VmData: 24060 VmHWM: 192648 WC: 0.04414 SelfUser: 0.038 SelfSys: 0.006 ChildUser: 0 Childsys: 0 chrono::Geometrie::reconstructionModeTocher::debut VmSize: 557568 VmRSS: 111668 VmPeak: 647920 VmData: 24860 VmHWM: 192648 ::fin VmSize: 557596 VmRSS: 113736 VmPeak: 647920 VmData: 24860 VmHWM: 192648 WC: 0.511831 SelfUser: 0.51 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 557596 VmRSS: 113736 VmPeak: 647920 VmData: 24860 VmHWM: 192648 WC: 0.991469 SelfUser: 0.913 SelfSys: 0.077 ChildUser: 0 Childsys: 0 ::fin VmSize: 557596 VmRSS: 113736 VmPeak: 647920 VmData: 24860 VmHWM: 192648 WC: 0.999435 SelfUser: 0.92 SelfSys: 0.077 ChildUser: 0 Childsys: 0 Gestion des booleens et des scalaires : ------------------------------------- Basculement du booleen "ActiveExportations" a true chrono::ProblemeGD::asgnParametresEtInitialise::debut VmSize: 557596 VmRSS: 113736 VmPeak: 647920 VmData: 24860 VmHWM: 192648 Vous utilisez un Field split! WOW! cool! 
2 u* pression* Numerote GIS SSFeuilles: ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x7fff820e9460] Couplages (28): 1) 0x45f7240 : uX:uX sur domaine # 10 de type "8Maillage" type TF: 2 2) 0x45f7300 : uX:uY sur domaine # 10 de type "8Maillage" type TF: 2 3) 0x45f73c0 : uX:uZ sur domaine # 10 de type "8Maillage" type TF: 2 4) 0x45f7070 : uX:pression sur domaine # 10 de type "8Maillage" type TF: 2 5) 0x470ffa0 : uY:uX sur domaine # 10 de type "8Maillage" type TF: 2 6) 0x4726770 : uY:uY sur domaine # 10 de type "8Maillage" type TF: 2 7) 0x45f6e80 : uY:uZ sur domaine # 10 de type "8Maillage" type TF: 2 8) 0x45f71a0 : uY:pression sur domaine # 10 de type "8Maillage" type TF: 2 9) 0x45f6ed0 : uZ:uX sur domaine # 10 de type "8Maillage" type TF: 2 10) 0x4726980 : uZ:uY sur domaine # 10 de type "8Maillage" type TF: 2 11) 0x4726a00 : uZ:uZ sur domaine # 10 de type "8Maillage" type TF: 2 12) 0x431ce20 : uZ:pression sur domaine # 10 de type "8Maillage" type TF: 2 13) 0x4712e50 : pression:uX sur domaine # 10 de type "8Maillage" type TF: 2 14) 0x45f7120 : pression:uY sur domaine # 10 de type "8Maillage" type TF: 2 15) 0x431cda0 : pression:uZ sur domaine # 10 de type "8Maillage" type TF: 2 16) 0x431cea0 : pression:pression sur domaine # 10 de type "8Maillage" type TF: 2 17) 0x4726bd0 : uX:uX sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 18) 0x4726c50 : uX:uY sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 19) 0x4726cd0 : uX:uZ sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 20) 0x4726d50 : uX:pression sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 21) 0x45f6530 : uY:uX sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 22) 0x45f65b0 : uY:uY sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 23) 0x45f6630 : uY:uZ sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 24) 0x45f66b0 : uY:pression sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 25) 0x45f6730 : uZ:uX sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 26) 0x45f67b0 : uZ:uY sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 27) 0x45f6830 : uZ:uZ sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 28) 0x45f68b0 : uZ:pression sur domaine # 50 de type "17EntiteGeometrique" type TF: 2 Champs hors couplages (0): Liste champs connus (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs ?quations (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs inconnues (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) ------------------------------------------------------------------------- Infos internes (4): 0) 0x433bc30 : uX [REI] 1) 0x433ccc0 : uY [REI] 2) 0x433dd50 : uZ [REI] 3) 0x4340a80 : pression [REI] ------------------------------------------------------------------------- Numerote GIS SSS: ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x7fff820e9650] Couplages (16): 1) 0x45f7240 : uX:uX sur domaine # 10 de type "8Maillage" type TF: 2 2) 0x45f7300 : uX:uY sur domaine # 10 de type "8Maillage" type TF: 2 3) 0x45f73c0 : uX:uZ sur domaine # 10 de type "8Maillage" type TF: 2 4) 0x45f7070 : uX:pression sur domaine # 10 de type "8Maillage" type TF: 2 5) 0x470ffa0 : uY:uX sur domaine # 10 de type "8Maillage" type TF: 2 6) 
0x4726770 : uY:uY sur domaine # 10 de type "8Maillage" type TF: 2 7) 0x45f6e80 : uY:uZ sur domaine # 10 de type "8Maillage" type TF: 2 8) 0x45f71a0 : uY:pression sur domaine # 10 de type "8Maillage" type TF: 2 9) 0x45f6ed0 : uZ:uX sur domaine # 10 de type "8Maillage" type TF: 2 10) 0x4726980 : uZ:uY sur domaine # 10 de type "8Maillage" type TF: 2 11) 0x4726a00 : uZ:uZ sur domaine # 10 de type "8Maillage" type TF: 2 12) 0x431ce20 : uZ:pression sur domaine # 10 de type "8Maillage" type TF: 2 13) 0x4712e50 : pression:uX sur domaine # 10 de type "8Maillage" type TF: 2 14) 0x45f7120 : pression:uY sur domaine # 10 de type "8Maillage" type TF: 2 15) 0x431cda0 : pression:uZ sur domaine # 10 de type "8Maillage" type TF: 2 16) 0x431cea0 : pression:pression sur domaine # 10 de type "8Maillage" type TF: 2 Champs hors couplages (0): Liste champs connus (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs ?quations (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs inconnues (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) ------------------------------------------------------------------------- Infos internes (4): 0) 0x433bc30 : uX [REI] 1) 0x433ccc0 : uY [REI] 2) 0x433dd50 : uZ [REI] 3) 0x4340a80 : pression [REI] ------------------------------------------------------------------------- ajoute la sous-chaine: u*,pression* Tous les groupes:[u*,pression*] traite le field split (d?group?): u* FS: ajoute le champ: uX FS: ajoute le champ: uY FS: ajoute le champ: uZ FS: n'ajoute pas le champ: pression ligne: On a cr?? le sous-syst?me symbolique suivant en (0,0) : ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x45f90b0] Couplages (9): 1) 0x45f52e0 : uX:uX sur domaine # 10 de type "8Maillage" type TF: 2 2) 0x45f9360 : uX:uY sur domaine # 10 de type "8Maillage" type TF: 2 3) 0x45f9440 : uX:uZ sur domaine # 10 de type "8Maillage" type TF: 2 4) 0x45f9520 : uY:uX sur domaine # 10 de type "8Maillage" type TF: 2 5) 0x45f95d0 : uY:uY sur domaine # 10 de type "8Maillage" type TF: 2 6) 0x45f9670 : uY:uZ sur domaine # 10 de type "8Maillage" type TF: 2 7) 0x45f9710 : uZ:uX sur domaine # 10 de type "8Maillage" type TF: 2 8) 0x45f97e0 : uZ:uY sur domaine # 10 de type "8Maillage" type TF: 2 9) 0x45f9880 : uZ:uZ sur domaine # 10 de type "8Maillage" type TF: 2 Champs hors couplages (0): Liste champs connus (3): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) Liste champs ?quations (3): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) Liste champs inconnues (3): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) ------------------------------------------------------------------------- Infos internes (3): 0) 0x433bc30 : uX [REI] 1) 0x433ccc0 : uY [REI] 2) 0x433dd50 : uZ [REI] ------------------------------------------------------------------------- lignes: On a trouv? 2 GIS dont les enfants sont tous inclus dans les 9 IF+SG pr?sents dans aGISNumerotationComplet Reconstitution du champ ligne u par ses GIS enfants traite le field split (d?group?): pression* FS: n'ajoute pas le champ: uX FS: n'ajoute pas le champ: uY FS: n'ajoute pas le champ: uZ FS: ajoute le champ: pression ligne: On a cr?? 
le sous-syst?me symbolique suivant en (1,1) : ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x45fe730] Couplages (1): 1) 0x45fe920 : pression:pression sur domaine # 10 de type "8Maillage" type TF: 2 Champs hors couplages (0): Liste champs connus (1): 1) 0x4340a80 : pression (racine) Liste champs ?quations (1): 1) 0x4340a80 : pression (racine) Liste champs inconnues (1): 1) 0x4340a80 : pression (racine) ------------------------------------------------------------------------- Infos internes (1): 0) 0x4340a80 : pression [REI] ------------------------------------------------------------------------- lignes: On a trouv? 1 GIS dont les enfants sont tous inclus dans les 9 IF+SG pr?sents dans aGISNumerotationComplet Reconstitution du champ ligne pression par ses GIS enfants ligne: On a cr?? le sous-syst?me symbolique suivant en (1,0) : ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x4603080] Couplages (3): 1) 0x4603290 : pression:uX sur domaine # 10 de type "8Maillage" type TF: 2 2) 0x4603380 : pression:uY sur domaine # 10 de type "8Maillage" type TF: 2 3) 0x4603430 : pression:uZ sur domaine # 10 de type "8Maillage" type TF: 2 Champs hors couplages (0): Liste champs connus (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs ?quations (1): 1) 0x4340a80 : pression (racine) Liste champs inconnues (3): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) ------------------------------------------------------------------------- Infos internes (4): 0) 0x433bc30 : uX [RI] 1) 0x433ccc0 : uY [RI] 2) 0x433dd50 : uZ [RI] 3) 0x4340a80 : pression [RE] ------------------------------------------------------------------------- lignes: On a trouv? 3 GIS dont les enfants sont tous inclus dans les 9 IF+SG pr?sents dans aGISNumerotationComplet Reconstitution du champ ligne pression par ses GIS enfants Reconstitution du champ ligne u par ses GIS enfants col: On a cr?? le sous-syst?me symbolique suivant en (0,1) : ----------------------------------------------------------------------- SystemeSymbolique::affiche() [ this = 0x4608140] Couplages (3): 1) 0x4608330 : uX:pression sur domaine # 10 de type "8Maillage" type TF: 2 2) 0x4608420 : uY:pression sur domaine # 10 de type "8Maillage" type TF: 2 3) 0x46084d0 : uZ:pression sur domaine # 10 de type "8Maillage" type TF: 2 Champs hors couplages (0): Liste champs connus (4): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) 4) 0x4340a80 : pression (racine) Liste champs ?quations (3): 1) 0x433bc30 : uX (racine) 2) 0x433ccc0 : uY (racine) 3) 0x433dd50 : uZ (racine) Liste champs inconnues (1): 1) 0x4340a80 : pression (racine) ------------------------------------------------------------------------- Infos internes (4): 0) 0x433bc30 : uX [RE] 1) 0x433ccc0 : uY [RE] 2) 0x433dd50 : uZ [RE] 3) 0x4340a80 : pression [RI] ------------------------------------------------------------------------- colonnes: On a trouv? 3 GIS dont les enfants sont tous inclus dans les 9 IF+SG pr?sents dans aGISNumerotationComplet Reconstitution du champ colonne u par ses GIS enfants Reconstitution du champ colonne pression par ses GIS enfants On garde les lignes avec DDLs dirichlet: DDLsNum: CL Champ ? imposer: uZ DDLsNum: CL Champ ? imposer: uY DDLsNum: CL Champ ? imposer: uX DDLsNum: CL Champ ? imposer: u Trouv? 
Champ de la CL dans la num?rotation! Exportation au format: GIREF [0] PetscCommDuplicate(): Duplicating a communicator 68383344 73355168 max tags = 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] VecScatterCreate(): Special case: sequential vector general to stride [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 ::fin VmSize: 559548 VmRSS: 119748 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.174405 SelfUser: 0.174 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::ProblemeGD::resoudre::debut VmSize: 559548 VmRSS: 119748 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::SolveurLinPETSc::initialise:AssembleurGD::debut VmSize: 559548 VmRSS: 119972 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::SolveurLinPETSc::initialise:initialiseNumerotation:AssembleurGD::debut VmSize: 559548 VmRSS: 119972 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 119972 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.000196 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::initialise:initialiseObjetPETSc:AssembleurGD::debut VmSize: 559548 VmRSS: 119972 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 ::fin VmSize: 559548 VmRSS: 119972 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.000452 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 OptionsSolveurLinPETSc::configureMatrice de Matrice_AssembleurGD (gcrSchur_) Instanciation (configureMatrice) de la matrice de prefixe: gcrSchur_ Identifiant de la matrice: M0_ Type Matrice : nest Librairie du Solveur : mumps aMatriceSymetrique : 0 aSymetrieDetruite : 0 aIgnoreNonSymetrie : 0 aBasculeTypeCSR : 0 aTypeSymetrique : sbaij aTypeNonSymetrique : aij aTypeMatLu : nest Nom du solveur : mon_solvlin Entre dans MatricePETScParBlocs::asgnDimension 0x4431710 On appelle reqOptionsSousBloc avec a_00 dans gcrSchur_ On cr?e le pr?fixe: gcrSchur_fieldsplit_a_00_ prefixe : gcrSchur_fieldsplit_a_00_ type matrice : aij type precond : lu type solveur : preonly (direct) librairie : petsc 0x4431710 asgnOptionsSousBloc par le nom du sous-bloc: a_00 de pr?fixe: gcrSchur_fieldsplit_a_00_ dans gcrSchur_ MPB(0,0) = N/A solv: 0 Instanciation (configureMatrice) de la matrice de prefixe: gcrSchur_fieldsplit_a_00_ Identifiant de la matrice: M1_ Type Matrice : aij Librairie du Solveur : librairie_auto aMatriceSymetrique : 0 aSymetrieDetruite : 0 aIgnoreNonSymetrie : 0 aBasculeTypeCSR : 0 aTypeSymetrique : sbaij aTypeNonSymetrique : aij aTypeMatLu : aij Bravo, vous avez le bon nb de sous-blocs vs sous-num?rotations! 
chrono::MatricePETSc::asgnDimension gcrSchur_fieldsplit_a_00_::debut VmSize: 559548 VmRSS: 120232 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::creeObjet(81x81)::debut VmSize: 559548 VmRSS: 120232 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::algoMetaCouplages()::debut VmSize: 559548 VmRSS: 120232 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::visiteMaillage::debut VmSize: 559548 VmRSS: 121288 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 121552 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001453 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CompteurNonZeroCSRGIS::SetNonZeros::debut VmSize: 559548 VmRSS: 121552 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 121552 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.004112 SelfUser: 0.004 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 121552 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.014277 SelfUser: 0.013 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::creeObjetPrive::debut VmSize: 559548 VmRSS: 121552 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] MatCreate_SeqAIJ_Inode(): Not using Inode routines due to -mat_no_inode ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001289 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::Couplages::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.002659 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroGIS::FinAssemblage::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.00084 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.004598 SelfUser: 0.004 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.024048 SelfUser: 0.021 SelfSys: 0.003 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.024529 SelfUser: 0.021 SelfSys: 0.003 ChildUser: 0 Childsys: 0 MPB(0,1) = N/A solv: 0 Instanciation (configureMatrice) de la matrice de prefixe: gcrSchur_BlocsHDiag_ Identifiant de la matrice: M2_ Type Matrice : aij Librairie du Solveur : librairie_auto aMatriceSymetrique : 0 aSymetrieDetruite : 1 aIgnoreNonSymetrie : 0 aBasculeTypeCSR : 0 aTypeSymetrique : sbaij aTypeNonSymetrique : aij aTypeMatLu : aij Bravo, vous avez le bon nb de sous-blocs vs sous-num?rotations! 
chrono::MatricePETSc::asgnDimension gcrSchur_BlocsHDiag_::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::creeObjet(81x8)::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::algoMetaCouplages()::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::visiteMaillage::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.00083 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CompteurNonZeroCSRGIS::SetNonZeros::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001924 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.00785 SelfUser: 0.007 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::creeObjetPrive::debut VmSize: 559548 VmRSS: 122080 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] MatCreate_SeqAIJ_Inode(): Not using Inode routines due to -mat_no_inode ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001006 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::Couplages::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001945 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroGIS::FinAssemblage::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.000763 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.003795 SelfUser: 0.002 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.019979 SelfUser: 0.018 SelfSys: 0.002 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.020403 SelfUser: 0.018 SelfSys: 0.002 ChildUser: 0 Childsys: 0 MPB(1,0) = N/A solv: 0 Instanciation (configureMatrice) de la matrice de prefixe: gcrSchur_BlocsHDiag_ Identifiant de la matrice: M3_ Type Matrice : aij Librairie du Solveur : librairie_auto aMatriceSymetrique : 0 aSymetrieDetruite : 1 aIgnoreNonSymetrie : 0 aBasculeTypeCSR : 0 aTypeSymetrique : sbaij aTypeNonSymetrique : aij aTypeMatLu : aij Bravo, vous avez le bon nb de sous-blocs vs sous-num?rotations! 
chrono::MatricePETSc::asgnDimension gcrSchur_BlocsHDiag_::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::creeObjet(8x81)::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::algoMetaCouplages()::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::visiteMaillage::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001043 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CompteurNonZeroCSRGIS::SetNonZeros::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.002736 SelfUser: 0.002 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.018447 SelfUser: 0.018 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::creeObjetPrive::debut VmSize: 559548 VmRSS: 122344 VmPeak: 647920 VmData: 26812 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] MatCreate_SeqAIJ_Inode(): Not using Inode routines due to -mat_no_inode ::fin VmSize: 559548 VmRSS: 122608 VmPeak: 647920 VmData: 26812 VmHWM: 192648 WC: 0.001003 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::debut VmSize: 559548 VmRSS: 122608 VmPeak: 647920 VmData: 26812 VmHWM: 192648 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::Couplages::debut VmSize: 559548 VmRSS: 122608 VmPeak: 647920 VmData: 26812 VmHWM: 192648 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.004956 SelfUser: 0.005 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroGIS::FinAssemblage::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.001032 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.007204 SelfUser: 0.007 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.034172 SelfUser: 0.033 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.034704 SelfUser: 0.033 SelfSys: 0.001 ChildUser: 0 Childsys: 0 MPB(1,1) = N/A solv: 0 Instanciation (configureMatrice) de la matrice de prefixe: gcrSchur_fieldsplit_a_11_ Identifiant de la matrice: M4_ Type Matrice : sbaij Librairie du Solveur : librairie_auto aMatriceSymetrique : 0 aSymetrieDetruite : 0 aIgnoreNonSymetrie : 0 aBasculeTypeCSR : 0 aTypeSymetrique : sbaij aTypeNonSymetrique : aij aTypeMatLu : sbaij Bravo, vous avez le bon nb de sous-blocs vs sous-num?rotations! 
chrono::MatricePETSc::asgnDimension gcrSchur_fieldsplit_a_11_::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 chrono::MatricePETScCSR::creeObjet(8x8)::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::algoMetaCouplages()::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 chrono::CompteurNonZeroCSRGIS::visiteMaillage::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.000798 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CompteurNonZeroCSRGIS::SetNonZeros::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.002245 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.01242 SelfUser: 0.011 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::creeObjetPrive::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] MatCreate_SeqSBAIJ(): Not using Inode routines due to -mat_no_inode [0] MatSetOption_SeqSBAIJ(): Option UNUSED_NONZERO_LOCATION_ERR not relevent ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.001146 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 chrono::MatricePETScCSR::definirStructureEtMettreAZeroAvecGIS::Couplages::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.004398 SelfUser: 0.005 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETScCSR::definirStructureEtMettreAZeroGIS::FinAssemblage::debut VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.000767 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.006251 SelfUser: 0.006 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.028055 SelfUser: 0.026 SelfSys: 0.002 ChildUser: 0 Childsys: 0 ::fin VmSize: 559680 VmRSS: 122608 VmPeak: 647920 VmData: 26944 VmHWM: 192648 WC: 0.028525 SelfUser: 0.026 SelfSys: 0.002 ChildUser: 0 Childsys: 0 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 Construction (creeMatriceEtAssigneProfile) de la matrice de prefixe: gcrSchur_ Type Matrice : nest Nouvelle Matrice : VRAI Nom du probleme : AssembleurGD Librairie du Solveur : mumps Matrice Refaite dans SolveurLinPETSc::asgnProfileMatrice chrono::SolveurLinPETSc::initialise:newVecResidu:AssembleurGD::debut VmSize: 559812 VmRSS: 122872 VmPeak: 647920 
VmData: 27076 VmHWM: 192648 ::fin VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 WC: 0.000223 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::initialise:initialiseVecResidu:AssembleurGD::debut VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] VecScatterCreate(): Special case: sequential vector general to stride [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 ::fin VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 WC: 0.000515 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::initialise:newVecCorrection:AssembleurGD::debut VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 ::fin VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 WC: 0.000205 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::initialise:initialiseVecCorrection:AssembleurGD::debut VmSize: 559812 VmRSS: 122872 VmPeak: 647920 VmData: 27076 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] VecScatterCreate(): Special case: sequential vector general to stride [0] PetscCommDuplicate(): Using internal PETSc communicator 68383344 73355168 ::fin VmSize: 559944 VmRSS: 122872 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000466 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 122872 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.116025 SelfUser: 0.107 SelfSys: 0.009 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PreTraitParPasDeTemps:AssembleurGD::debut VmSize: 559944 VmRSS: 122872 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:DebutPasDeTemps::************ProblemeGD_aPPExecutePreTraitement************::debut VmSize: 559944 VmRSS: 122872 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePreTraitement************::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ======== Pas de Temps [1] = 1 ======== chrono::PP::aPPMAJDeplacementNewtonPrecedent::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000304 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.003025 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.003435 SelfUser: 0.003 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.004056 SelfUser: 0.003 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::EcritResuPreParPasDeTemps:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 
VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000211 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:DebutIterationNlin::************ProblemeGD_aPPExecutePreTraitementIteration************::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePreTraitementIteration************::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::PP::aPPMAJNoIterationNewton::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000202 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPMAJNoIterationCumule::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::PP::PostTrait:aPPMAJNoIterationCumule:aPPMAJNoIterationCumuleAuxVoisins::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::PP::aPPMAJNoIterationCumuleAuxVoisins::effectueCalcul::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000231 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000622 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.001034 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.00181 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.002237 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.002619 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::faisAssemblage:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueChamp:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.000452 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::debutAssemblage:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 WC: 0.00037 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::assembleMatriceEtResidu:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::ProblemeEF::assemblePriveDomaine:AssembleurGD::debut VmSize: 559944 VmRSS: 123136 VmPeak: 647920 VmData: 27208 VmHWM: 192648 chrono::ProblemeEF::assemblePrive:Mat:Vec:AssembleurGD::debut VmSize: 559944 VmRSS: 123400 VmPeak: 647920 VmData: 27208 VmHWM: 192648 ::fin VmSize: 565144 VmRSS: 124180 VmPeak: 647920 VmData: 27360 VmHWM: 192648 WC: 0.153915 SelfUser: 0.047 SelfSys: 0.006 
ChildUser: 0 Childsys: 0 ::fin VmSize: 565144 VmRSS: 124184 VmPeak: 647920 VmData: 27360 VmHWM: 192648 WC: 0.155662 SelfUser: 0.049 SelfSys: 0.006 ChildUser: 0 Childsys: 0 chrono::ProblemeEF::assemblePrivePeau:AssembleurGD::debut VmSize: 565144 VmRSS: 124184 VmPeak: 647920 VmData: 27360 VmHWM: 192648 ::fin VmSize: 565144 VmRSS: 124184 VmPeak: 647920 VmData: 27360 VmHWM: 192648 WC: 0.000246 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 565144 VmRSS: 124184 VmPeak: 647920 VmData: 27360 VmHWM: 192648 WC: 0.159546 SelfUser: 0.051 SelfSys: 0.008 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueMatrice:AssembleurGD::debut VmSize: 565144 VmRSS: 124184 VmPeak: 647920 VmData: 27360 VmHWM: 192648 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 571572 VmRSS: 124664 VmPeak: 647920 VmData: 27632 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 571572 VmRSS: 124664 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.0018 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 571572 VmRSS: 124664 VmPeak: 647920 VmData: 27632 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 571572 VmRSS: 124664 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.000881 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. 
Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 ::fin VmSize: 571572 VmRSS: 124696 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.193168 SelfUser: 0.048 SelfSys: 0.004 ChildUser: 0.098 Childsys: 0.048 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueResidu:AssembleurGD::debut VmSize: 571572 VmRSS: 124696 VmPeak: 647920 VmData: 27632 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. ::fin VmSize: 577728 VmRSS: 124808 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.160504 SelfUser: 0.018 SelfSys: 0.004 ChildUser: 0.092 Childsys: 0.059 chrono::SolveurLinPETSc::faisAssemblagePrive::finAssemblageAssembleurGD::debut VmSize: 577728 VmRSS: 124808 VmPeak: 647920 VmData: 27632 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. 
::fin VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.000381 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.515877 SelfUser: 0.119 SelfSys: 0.016 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::resoudre_et_RechercheLineaire:AssembleurGD::debut VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 chrono::SolveurLinPETSc::resoudre_Factorisation_et_DR:AssembleurGD::debut VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 chrono::asgnOperateurKSP::debut VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 ::fin VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 WC: 0.000683 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::resoudre:KSPSetUp:AssembleurGD::debut VmSize: 577728 VmRSS: 124848 VmPeak: 647920 VmData: 27632 VmHWM: 192648 nomme champ 0 : a_00 nomme champ 1 : schur [0] PCSetUp(): Setting up PC for first time[0] VecScatterCreate(): Special case: sequential vector stride to stride [0] VecScatterCreate(): Special case: sequential vector stride to stride 0x4431710 On appelle reqOptionsSousBloc avec a_00 dans gcrSchur_ [0] PCSetUp(): Setting up PC for first time[0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 5 needed 1.03902 [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 1.03902 or use [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,1.03902); [0] MatLUFactorSymbolic_SeqAIJ(): for best performance. [0] Mat_CheckInode_FactorLU(): Found 27 nodes of 81. Limit used: 5. 
Using Inode routines 0x4431710 On appelle reqOptionsSousBloc avec schur dans gcrSchur_ On cr?e le pr?fixe: gcrSchur_fieldsplit_schur_ prefixe : gcrSchur_fieldsplit_schur_ type matrice : schurcomplement type precond : jacobi type solveur : gcr (iteratif/precond) librairie : petsc 0x4431710 asgnOptionsSousBloc par le nom du sous-bloc: schur de pr?fixe: gcrSchur_fieldsplit_schur_ dans gcrSchur_ [0] PCSetUp(): Setting up PC for first time::fin VmSize: 580112 VmRSS: 127972 VmPeak: 647920 VmData: 30016 VmHWM: 192648 WC: 0.010764 SelfUser: 0.009 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::resoudre:KSPSolve:AssembleurGD::debut VmSize: 580112 VmRSS: 127972 VmPeak: 647920 VmData: 30016 VmHWM: 192648 [0] PetscCommDuplicate(): Duplicating a communicator 139814426695680 76804080 max tags = 2147483647 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 0 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 0 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, 
divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 Configuration (ecritInfoKSP) du KSP : Nom du solveur : mon_solvlin Librairie du Solveur : petsc Type du solveur : gcr (gcrSchur_) Type du pr?cond. : fieldsplit (gcrSchur_) Type Matrice : nest MatNonzeroState : 0 Type Matrice : nest Residual norms for gcrSchur_ solve. 0 KSP Residual norm 2.108717462444e-01 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged Residual norms for gcrSchur_fieldsplit_schur_ solve. 0 KSP Residual norm 1.651981957458e-03 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 2.204951633351e-05 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 2 KSP Residual norm 1.118981498028e-19 [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 1.118981498028e-19 is less than absolute tolerance 1.000000000000e-12 at iteration 2 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 6.726023515396e-17 [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 6.726023515396e-17 is less than absolute tolerance 1.000000000000e-11 at iteration 1 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated 
nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 ::fin VmSize: 580244 VmRSS: 128424 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.007586 SelfUser: 0.005 SelfSys: 0.003 ChildUser: 0 Childsys: 0 KSPSolve non zero id. : FAUX KSPSolve valeurs id. : FAUX [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 Err. rel. resolution : 4.4867e-16 / 3.56045e-16 ::fin VmSize: 580244 VmRSS: 128660 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.020825 SelfUser: 0.016 SelfSys: 0.005 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::CL+VecDDLs:AssembleurGD::debut VmSize: 580244 VmRSS: 128660 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128660 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.00024 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::appliqueCorrectionAuProbleme:AssembleurGD::debut VmSize: 580244 VmRSS: 128660 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.009112 SelfUser: 0.008 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.031035 SelfUser: 0.025 SelfSys: 0.006 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinIterationNlin::************ProblemeGD_aPPExecutePostTraitementIteration************::debut VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePostTraitementIteration************::effectueCalcul::debut VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000213 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.0006 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128720 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.001 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::AnalyseProcessusIteratif::CC:miseAZero: SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000206 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD it?ration # 
1,chrono::AnalyseProcessusIteratif::CC:reqConvAtteinte:SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::CC::PreTrait:CCNL2Res:aPPMAJDeplacementNewtonPrecedent::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJDeplacementNewtonPrecedent::effectueCalcul::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000265 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000676 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CC::PostTrait:CCNInf:aPPMAJNormeInfCorrection::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNormeInfCorrection::effectueCalcul::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000208 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000598 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInf(1)[81]= 0.113041[0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 chrono::CC::PostTrait:CCNL2CorRel:aPPMAJNormeL2CorrectionRelative::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNormeL2CorrectionRelative::effectueCalcul::debut VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000241 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128788 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000624 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2CorRel(0)= 0.995307chrono::CC::PostTrait:CCNInfRes:aPPMAJNormeInfResidu::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNormeInfResidu::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000235 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000616 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInfR?s(0)[50]= 0.123724chrono::CC::PostTrait:CCNL2Res:aPPMAJNormeL2Residu::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNormeL2Residu::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000195 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000613 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2R?s(1)= 0.0223524::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.012743 SelfUser: 0.012 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 
chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:DebutIterationNlin::************ProblemeGD_aPPExecutePreTraitementIteration************::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePreTraitementIteration************::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNoIterationNewton::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000191 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPMAJNoIterationCumule::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::PostTrait:aPPMAJNoIterationCumule:aPPMAJNoIterationCumuleAuxVoisins::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::PP::aPPMAJNoIterationCumuleAuxVoisins::effectueCalcul::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000223 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000609 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.00098 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.001753 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.002121 SelfUser: 0.001 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.002494 SelfUser: 0.001 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::faisAssemblage:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 Construction (creeMatriceEtAssigneProfile) de la matrice de prefixe: gcrSchur_ Type Matrice : nest Nouvelle Matrice : FAUX Nom du probleme : AssembleurGD Librairie du Solveur : mumps chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueChamp:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000347 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::debutAssemblage:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 WC: 0.000363 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::assembleMatriceEtResidu:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::ProblemeEF::assemblePriveDomaine:AssembleurGD::debut VmSize: 580244 VmRSS: 128844 VmPeak: 647920 VmData: 30148 VmHWM: 192648 chrono::ProblemeEF::assemblePrive:Mat:Vec:AssembleurGD::debut VmSize: 580244 VmRSS: 128848 VmPeak: 647920 VmData: 30148 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.04273 SelfUser: 0.041 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.044193 
SelfUser: 0.043 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::ProblemeEF::assemblePrivePeau:AssembleurGD::debut VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000187 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.048016 SelfUser: 0.046 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueMatrice:AssembleurGD::debut VmSize: 580396 VmRSS: 128872 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000976 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000859 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. 
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.027587 SelfUser: 0.027 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueResidu:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.009115 SelfUser: 0.009 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::finAssemblageAssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. 
::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000286 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.087135 SelfUser: 0.085 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::resoudre_et_RechercheLineaire:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::SolveurLinPETSc::resoudre_Factorisation_et_DR:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::asgnOperateurKSP::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000226 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::resoudre:KSPSolve:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426695680 76804080 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 1 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during 
MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 Configuration (ecritInfoKSP) du KSP : Nom du solveur : mon_solvlin Librairie du Solveur : petsc Type du solveur : gcr (gcrSchur_) Type du pr?cond. : fieldsplit (gcrSchur_) Type Matrice : nest MatNonzeroState : 0 Type Matrice : nest [0] PCSetUp(): Setting up PC with same nonzero pattern Residual norms for gcrSchur_ solve. 0 KSP Residual norm 4.221202844964e-03 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Setting up PC with same nonzero pattern [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Setting up PC with same nonzero pattern [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged Residual norms for gcrSchur_fieldsplit_schur_ solve. 0 KSP Residual norm 1.111806403591e-04 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 8.504395824449e-06 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 2 KSP Residual norm 1.699170180987e-20 [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 1.699170180987e-20 is less than absolute tolerance 1.000000000000e-12 at iteration 2 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 1.003584996519e-18 [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 1.003584996519e-18 is less than absolute tolerance 1.000000000000e-11 at iteration 1 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 2 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 2 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated 
nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00749 SelfUser: 0.005 SelfSys: 0.003 ChildUser: 0 Childsys: 0 KSPSolve non zero id. : VRAI KSPSolve valeurs id. : FAUX [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 Err. rel. resolution : 3.06564e-16 / 2.66726e-16 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00933 SelfUser: 0.007 SelfSys: 0.003 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::CL+VecDDLs:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000186 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::appliqueCorrectionAuProbleme:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.008276 SelfUser: 0.008 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.018656 SelfUser: 0.016 SelfSys: 0.003 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinIterationNlin::************ProblemeGD_aPPExecutePostTraitementIteration************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePostTraitementIteration************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000183 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000562 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000933 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD it?ration # 2,chrono::AnalyseProcessusIteratif::CC:reqConvAtteinte:SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::CC::PreTrait:CCNL2Res:aPPMAJDeplacementNewtonPrecedent::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 
chrono::PP::aPPMAJDeplacementNewtonPrecedent::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000251 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00062 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::CC::PostTrait:CCNInf:aPPMAJNormeInfCorrection::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfCorrection::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000186 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000561 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 CCNInf(1)[81]= 0.00932268chrono::CC::PostTrait:CCNL2CorRel:aPPMAJNormeL2CorrectionRelative::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2CorrectionRelative::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000187 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000559 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2CorRel(0)= 0.0632437chrono::CC::PostTrait:CCNInfRes:aPPMAJNormeInfResidu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfResidu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000554 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInfR?s(0)[21]= 0.00141465chrono::CC::PostTrait:CCNL2Res:aPPMAJNormeL2Residu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2Residu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000554 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2R?s(1)= 0.000447447::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.012174 SelfUser: 0.01 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:DebutIterationNlin::************ProblemeGD_aPPExecutePreTraitementIteration************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePreTraitementIteration************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNoIterationNewton::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 
580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000187 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPMAJNoIterationCumule::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::PostTrait:aPPMAJNoIterationCumule:aPPMAJNoIterationCumuleAuxVoisins::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNoIterationCumuleAuxVoisins::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000199 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000581 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000948 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.001696 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002146 SelfUser: 0.001 SelfSys: 0.002 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002518 SelfUser: 0.001 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::faisAssemblage:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 Construction (creeMatriceEtAssigneProfile) de la matrice de prefixe: gcrSchur_ Type Matrice : nest Nouvelle Matrice : FAUX Nom du probleme : AssembleurGD Librairie du Solveur : mumps chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueChamp:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000312 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::debutAssemblage:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000307 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::assembleMatriceEtResidu:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePriveDomaine:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePrive:Mat:Vec:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.041738 SelfUser: 0.042 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.042774 SelfUser: 0.043 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::ProblemeEF::assemblePrivePeau:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000183 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.046532 SelfUser: 0.047 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueMatrice:AssembleurGD::debut VmSize: 580396 VmRSS: 
128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000937 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000857 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. 
Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.026958 SelfUser: 0.025 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueResidu:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00904 SelfUser: 0.009 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::finAssemblageAssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000285 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.084839 SelfUser: 0.082 SelfSys: 0.003 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::resoudre_et_RechercheLineaire:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::SolveurLinPETSc::resoudre_Factorisation_et_DR:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::asgnOperateurKSP::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000224 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::resoudre:KSPSolve:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426695680 76804080 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 2 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI 
processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 2 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 Configuration (ecritInfoKSP) du KSP : Nom du solveur : mon_solvlin Librairie du Solveur : petsc Type du solveur : gcr (gcrSchur_) Type du pr?cond. : fieldsplit (gcrSchur_) Type Matrice : nest MatNonzeroState : 0 Type Matrice : nest [0] PCSetUp(): Setting up PC with same nonzero pattern Residual norms for gcrSchur_ solve. 
0 KSP Residual norm 3.649269236024e-07 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Setting up PC with same nonzero pattern [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Setting up PC with same nonzero pattern [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged Residual norms for gcrSchur_fieldsplit_schur_ solve. 0 KSP Residual norm 6.412889148832e-09 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 3.715668495120e-10 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 2 KSP Residual norm 4.606225538561e-21 [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 4.606225538561e-21 is less than absolute tolerance 1.000000000000e-12 at iteration 2 [0] PCSetUp(): Leaving PC with identical preconditioner since operator is unchanged 1 KSP Residual norm 4.606135580612e-21 [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 4.606135580612e-21 is less than absolute tolerance 1.000000000000e-11 at iteration 1 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, 
cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.007458 SelfUser: 0.005 SelfSys: 0.002 ChildUser: 0 Childsys: 0 KSPSolve non zero id. : VRAI KSPSolve valeurs id. : FAUX [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 Err. rel. 
resolution : 1.89337e-14 / 1.26223e-14 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.009294 SelfUser: 0.007 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::CL+VecDDLs:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000189 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::appliqueCorrectionAuProbleme:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.008102 SelfUser: 0.008 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.018403 SelfUser: 0.016 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinIterationNlin::************ProblemeGD_aPPExecutePostTraitementIteration************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePostTraitementIteration************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000181 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000545 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000928 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD it?ration # 3,chrono::AnalyseProcessusIteratif::CC:reqConvAtteinte:SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::CC::PreTrait:CCNL2Res:aPPMAJDeplacementNewtonPrecedent::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJDeplacementNewtonPrecedent::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000243 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000621 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::CC::PostTrait:CCNInf:aPPMAJNormeInfCorrection::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfCorrection::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000552 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInf(1)[84]= 5.13693e-07chrono::CC::PostTrait:CCNL2CorRel:aPPMAJNormeL2CorrectionRelative::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2CorrectionRelative::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 
VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000558 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2CorRel(0)= 3.83885e-06chrono::CC::PostTrait:CCNInfRes:aPPMAJNormeInfResidu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfResidu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000187 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000568 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInfR?s(0)[46]= 1.39608e-07chrono::CC::PostTrait:CCNL2Res:aPPMAJNormeL2Residu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2Residu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000226 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000587 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2R?s(1)= 3.86822e-08::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.012286 SelfUser: 0.011 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:DebutIterationNlin::************ProblemeGD_aPPExecutePreTraitementIteration************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePreTraitementIteration************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNoIterationNewton::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPMAJNoIterationCumule::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::PostTrait:aPPMAJNoIterationCumule:aPPMAJNoIterationCumuleAuxVoisins::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNoIterationCumuleAuxVoisins::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000198 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00057 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000936 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.001682 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002056 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 
VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002435 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::faisAssemblage:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 Construction (creeMatriceEtAssigneProfile) de la matrice de prefixe: gcrSchur_ Type Matrice : nest Nouvelle Matrice : FAUX Nom du probleme : AssembleurGD Librairie du Solveur : mumps chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueChamp:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000303 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::debutAssemblage:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000309 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::assembleMatriceEtResidu:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePriveDomaine:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePrive:Mat:Vec:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.041683 SelfUser: 0.041 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.042739 SelfUser: 0.042 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::ProblemeEF::assemblePrivePeau:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000184 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.046532 SelfUser: 0.046 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueMatrice:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. 
::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00088 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::MatricePETSc::mettreAZeroLignes::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000853 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 81; storage space: 0 unneeded,1845 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 54 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 81 X 8; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 8 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 27)/(num_localrows 81) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 8 X 81; storage space: 0 unneeded,144 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 24 [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 8) < 0.6. Do not use CompressedRow routines. [0] MatAssemblyEnd_SeqSBAIJ(): Matrix size: 8 X 8, block size 1; storage space: 0 unneeded, 8 used [0] MatAssemblyEnd_SeqSBAIJ(): Number of mallocs during MatSetValues is 0 [0] MatAssemblyEnd_SeqSBAIJ(): Most nonzeros blocks in any row is 1 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.026892 SelfUser: 0.026 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::appliqueResidu:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. 
::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00914 SelfUser: 0.009 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::faisAssemblagePrive::finAssemblageAssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000316 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.08494 SelfUser: 0.083 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::resoudre_et_RechercheLineaire:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::SolveurLinPETSc::resoudre_Factorisation_et_DR:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::asgnOperateurKSP::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000224 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::resoudre:KSPSolve:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426695680 76804080 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement 
rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 Configuration (ecritInfoKSP) du KSP : Nom du solveur : mon_solvlin Librairie du Solveur : petsc Type du solveur : gcr (gcrSchur_) Type du pr?cond. : fieldsplit (gcrSchur_) Type Matrice : nest MatNonzeroState : 0 Type Matrice : nest [0] PCSetUp(): Setting up PC with same nonzero pattern Residual norms for gcrSchur_ solve. 0 KSP Residual norm 3.305223331380e-15 [0] KSPConvergedDefault(): Linear solver has converged. 
Residual norm 3.305223331380e-15 is less than absolute tolerance 1.000000000000e-11 at iteration 0 KSP Object:(gcrSchur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=100, initial guess is zero tolerances: relative=1e-11, absolute=1e-11, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object:(gcrSchur_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: gcr GCR: restart = 30 GCR: restarts performed = 3 maximum iterations=10000, initial guess is zero tolerances: relative=1e-12, absolute=1e-12, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: schurcomplement rows=8, cols=8 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 A10 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=8, cols=81 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.03902 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=81, cols=81, bs=3 package used to perform factorization: petsc total: nonzeros=1917, allocated nonzeros=1917 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 27 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (gcrSchur_fieldsplit_a_00_) 1 MPI processes type: seqaij rows=81, cols=81, bs=3 total: nonzeros=1845, allocated 
nonzeros=1845 total number of mallocs used during MatSetValues calls =0 not using I-node routines A01 Mat Object: (gcrSchur_BlocsHDiag_) 1 MPI processes type: seqaij rows=81, cols=8 total: nonzeros=144, allocated nonzeros=144 total number of mallocs used during MatSetValues calls =0 not using I-node routines Mat Object: (gcrSchur_fieldsplit_schur_) 1 MPI processes type: seqsbaij rows=8, cols=8 total: nonzeros=8, allocated nonzeros=8 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: nest rows=89, cols=89 Matrix object: type=nest, rows=2, cols=2 MatNest structure: (0,0) : prefix="gcrSchur_fieldsplit_a_00_", type=seqaij, rows=81, cols=81 (0,1) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=81, cols=8 (1,0) : prefix="gcrSchur_BlocsHDiag_", type=seqaij, rows=8, cols=81 (1,1) : prefix="gcrSchur_fieldsplit_schur_", type=seqsbaij, rows=8, cols=8 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00474 SelfUser: 0.004 SelfSys: 0.001 ChildUser: 0 Childsys: 0 KSPSolve non zero id. : VRAI KSPSolve valeurs id. : FAUX [0] PetscCommDuplicate(): Using internal PETSc communicator 139814426697728 68462368 Err. rel. resolution : 1 / 1 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.006545 SelfUser: 0.005 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::CL+VecDDLs:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000186 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::appliqueCorrectionAuProbleme:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00819 SelfUser: 0.008 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.01575 SelfUser: 0.014 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PPIteration:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinIterationNlin::************ProblemeGD_aPPExecutePostTraitementIteration************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePostTraitementIteration************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000182 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000587 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000956 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD it?ration # 4,chrono::AnalyseProcessusIteratif::CC:reqConvAtteinte:SolveurStatNlinPETSc::BouclePointFixe:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::CC::PreTrait:CCNL2Res:aPPMAJDeplacementNewtonPrecedent::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJDeplacementNewtonPrecedent::effectueCalcul::debut 
VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000284 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000653 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::CC::PostTrait:CCNInf:aPPMAJNormeInfCorrection::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfCorrection::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000191 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000559 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInf(1)[0]= 0chrono::CC::PostTrait:CCNL2CorRel:aPPMAJNormeL2CorrectionRelative::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2CorrectionRelative::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000187 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00056 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2CorRel(1)= 0chrono::CC::PostTrait:CCNInfRes:aPPMAJNormeInfResidu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeInfResidu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000186 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000555 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNInfR?s(1)[50]= 1.11196e-15chrono::CC::PostTrait:CCNL2Res:aPPMAJNormeL2Residu::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPMAJNormeL2Residu::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000186 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000575 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 CCNL2R?s(1)= 3.50353e-16::fin VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.012192 SelfUser: 0.01 SelfSys: 0.002 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::PostTraitParPasDeTemps:AssembleurGD::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinPasDeTemps::************ProblemeGD_aPPExecutePostTraitement************::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::************ProblemeGD_aPPExecutePostTraitement************::effectueCalcul::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::PreTrait:************ProblemeGD_aPPExecutePostTraitement************:aPPCalculPsiMoyen::debut VmSize: 580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculPsiMoyen::effectueCalcul::debut VmSize: 
580396 VmRSS: 128992 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00289 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.003268 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:************ProblemeGD_aPPExecutePostTraitement************:aPPCalculVolumeNonDeforme::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculVolumeNonDeforme::effectueCalcul::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000943 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.001361 SelfUser: 0 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:************ProblemeGD_aPPExecutePostTraitement************:aPPCalculVolumeDeforme::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculVolumeDeforme::effectueCalcul::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002386 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002763 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::ProblemeGD::executePostTraitement::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::ResiduDansChampReactionsNodales::effectueCalcul::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePriveDomaine:AssembleurGD::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::ProblemeEF::assemblePrive:Vec:AssembleurGD::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.030955 SelfUser: 0.03 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.03191 SelfUser: 0.03 SelfSys: 0.001 ChildUser: 0 Childsys: 0 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. 
::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.035811 SelfUser: 0.035 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.036312 SelfUser: 0.035 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.044626 SelfUser: 0.042 SelfSys: 0.002 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.045013 SelfUser: 0.042 SelfSys: 0.003 ChildUser: 0 Childsys: 0 chrono::GestionPrePostTraitement::executePrePostTrait:GESTIONPREPOSTDEFAUT:FinPasDeTemps::aPPExportVolume::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPExportVolume::effectueCalcul::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::PreTrait:aPPExportVolume:aPPCalculGradDefMoyen::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculGradDefMoyen::effectueCalcul::debut VmSize: 580396 VmRSS: 129028 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002911 SelfUser: 0.002 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.003289 SelfUser: 0.002 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:aPPExportVolume:aPPCalculPKIIMoyen::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculPKIIMoyen::effectueCalcul::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.003516 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.003905 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:aPPExportVolume:aPPCalculPsiMoyen::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculPsiMoyen::effectueCalcul::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002656 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.003036 SelfUser: 0.003 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:aPPExportVolume:aPPCalculVolumeNonDeforme::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculVolumeNonDeforme::effectueCalcul::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000905 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.001286 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::PreTrait:aPPExportVolume:aPPCalculVolumeDeforme::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPCalculVolumeDeforme::effectueCalcul::debut VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00236 SelfUser: 0.002 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 
129088 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.002745 SelfUser: 0.002 SelfSys: 0.001 ChildUser: 0 Childsys: 0 chrono::Maillage::exporteParallele::debut VmSize: 580396 VmRSS: 129192 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129312 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.004005 SelfUser: 0.004 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::PostTrait:aPPExportVolume:aPPEcrisFichierChampsPourAdaptation::debut VmSize: 580396 VmRSS: 129408 VmPeak: 647920 VmData: 30300 VmHWM: 192648 chrono::PP::aPPEcrisFichierChampsPourAdaptation::effectueCalcul::debut VmSize: 580396 VmRSS: 129408 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000594 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.00101 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.026378 SelfUser: 0.023 SelfSys: 0.003 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.026761 SelfUser: 0.023 SelfSys: 0.003 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.072381 SelfUser: 0.066 SelfSys: 0.006 ChildUser: 0 Childsys: 0 chrono::SolveurStatNlinPETSc::EcritResuPostParPasDeTemps:AssembleurGD::debut VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000185 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPExportGIREFCCCorrection::effectueCalcul::debut VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000418 SelfUser: 0.001 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::PP::aPPExportGIREFCCResidu::effectueCalcul::debut VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000371 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 ::fin VmSize: 580396 VmRSS: 129532 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 1.12123 SelfUser: 0.672 SelfSys: 0.066 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::detruitKSP::debut VmSize: 578344 VmRSS: 129656 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 578344 VmRSS: 129656 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000225 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 chrono::SolveurLinPETSc::detruitKSP::debut VmSize: 578344 VmRSS: 129656 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 578344 VmRSS: 129656 VmPeak: 647920 VmData: 30300 VmHWM: 192648 WC: 0.000189 SelfUser: 0 SelfSys: 0 ChildUser: 0 Childsys: 0 Destructeur DDLsNumerotation Destructeur DDLsNumerotation Destructeur DDLsNumerotation Destructeur DDLsNumerotation Destructeur DDLsNumerotation chrono::SolveurLinPETSc::detruitKSP::debut VmSize: 563980 VmRSS: 129728 VmPeak: 647920 VmData: 30300 VmHWM: 192648 ::fin VmSize: 563792 VmRSS: 124952 VmPeak: 647920 VmData: 30112 VmHWM: 192648 WC: 0.001961 SelfUser: 0.001 SelfSys: 0.001 ChildUser: 0 Childsys: 0 ::fin VmSize: 563792 VmRSS: 124964 VmPeak: 647920 VmData: 30112 VmHWM: 192648 WC: 2.4343 SelfUser: 1.9 SelfSys: 0.149 ChildUser: 0 Childsys: 0 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm 73355168 [0] 
Petsc_DelComm_Outer(): User MPI_Comm 68383344 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscFinalize(): PetscFinalize() called [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 68462368 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm 68462368 [0] Petsc_DelComm_Outer(): User MPI_Comm 139814426697728 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 68462368 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 68462368 [0] Petsc_DelThreadComm(): Deleting thread communicator data in an MPI_Comm 68462368 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 68462368 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 73355168 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 73355168 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 73355168 [0] Petsc_DelThreadComm(): Deleting thread communicator data in an MPI_Comm 73355168 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 73355168 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 76804080 [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator embedded in a user MPI_Comm 76804080 [0] Petsc_DelComm_Outer(): User MPI_Comm 139814426695680 is being freed after removing reference from inner PETSc comm to this outer comm [0] PetscCommDestroy(): Deleting PETSc MPI_Comm 76804080 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm 76804080 [0] Petsc_DelThreadComm(): Deleting thread communicator data in an MPI_Comm 76804080 [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm 76804080 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-Options_ProjectionL2ksp_atol value: 1e-15 Option left: name:-Options_ProjectionL2ksp_divtol value: 1e+12 Option left: name:-Options_ProjectionL2ksp_max_it value: 10000 Option left: name:-Options_ProjectionL2ksp_rtol value: 1e-15 Option left: name:-Options_ProjectionL2pc_hypre_type value: boomeramg Option left: name:-gcrSchur_fieldsplit_a_00_mat_bcs_columnmajor (no value) Option left: name:-gcrSchur_fieldsplit_a_00_mat_mkl_pardiso_6 value: 0 Option left: name:-gcrSchur_fieldsplit_a_00_mat_pardiso_69 value: 11 Option left: name:-gcrSchur_fieldsplit_a_00_mg_coarse_pc_factor_mat_solver_package value: mumps Option left: name:-gcrSchur_fieldsplit_a_00_pc_ml_maxNlevels value: 2 Option left: name:-gcrSchur_fieldsplit_a_11_mat_bcs_columnmajor (no value) Option left: name:-gcrSchur_fieldsplit_a_11_mat_mkl_pardiso_6 value: 0 Option left: name:-gcrSchur_fieldsplit_a_11_mat_mumps_icntl_14 value: 33 Option left: name:-gcrSchur_fieldsplit_a_11_mat_pardiso_69 value: 11 From knepley at gmail.com Thu Sep 11 11:26:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Sep 2014 11:26:26 -0500 Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit In-Reply-To: <5411C0F9.6090106@giref.ulaval.ca> References: <54042CDA.9080002@ntnu.no> <5411C0F9.6090106@giref.ulaval.ca> Message-ID: On Thu, Sep 11, 2014 at 10:34 AM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Hi, > > I was just curious to know why the prefix of the sub-matrix a_{11} in a > matnest is forced to the ksp prefix by PCSetUp_FieldSplit? 
> In my case it changes from "gcrSchur_fieldsplit_a_11_" to
> "gcrSchur_fieldsplit_schur_" after PCSetUp_FieldSplit, by these (I think)
> lines of code in fieldsplit.c (petsc 3.5.2):
>
>   const char *prefix;
>   ierr = MatGetSubMatrix(pc->pmat,ilink->is,ilink->is_col,MAT_INITIAL_MATRIX,&jac->pmat[i]);CHKERRQ(ierr);
>   ierr = KSPGetOptionsPrefix(ilink->ksp,&prefix);CHKERRQ(ierr);
>   ierr = MatSetOptionsPrefix(jac->pmat[i],prefix);CHKERRQ(ierr);
>   ierr = MatViewFromOptions(jac->pmat[i],NULL,"-mat_view");CHKERRQ(ierr);

I did this. My thinking was that if someone wanted to view the Schur
complement matrix, then this was the correct prefix.

> We wanted to pass options to the a_11 matrix, but using PC_COMPOSITE_SCHUR
> we have to give the a_{11} matrix a unique prefix, different from the ksp
> prefix used to solve the Schur complement (which we named "schur"), and
> have MatSetFromOptions use this unique prefix. It all worked, but we saw
> this "curiosity" in the kspview (see the attached log): PCSetUp_FieldSplit
> renames our matrix, after all the precautions we took to name it
> differently!

I am trying to understand the problem. This matrix is created by the
MatGetSubMatrix() call.

1) What options are you trying to pass to it?

2) Why would they interfere with the KSP options?

  Thanks,

     Matt

> The code is working, so maybe this is not an issue, but I can't tell from
> my knowledge whether it can be harmful?
>
> Thank you!
>
> Eric

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead. -- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Eric.Chamberland at giref.ulaval.ca Thu Sep 11 11:47:08 2014
From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland)
Date: Thu, 11 Sep 2014 12:47:08 -0400
Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit
In-Reply-To:
References: <54042CDA.9080002@ntnu.no> <5411C0F9.6090106@giref.ulaval.ca>
Message-ID: <5411D20C.4010305@giref.ulaval.ca>

On 09/11/2014 12:26 PM, Matthew Knepley wrote:
> I am trying to understand the problem. This matrix is created by the
> MatGetSubMatrix() call.

If this is right, then I misunderstood the effect of MatGetSubMatrix on a
MatNest. We created a MatNest which holds the a_11 matrix...

> 1) What options are you trying to pass to it?

mat_type sbaij
mat_block_size 3 (not in this example, however)

> 2) Why would they interfere with the KSP options?

I didn't mention that they could interfere, just that the name was changed
and that it was "curious" from my point of view.

But if the sub-matrix extracted from the MatNest is not the a_11 matrix we
gave to the MatNest, then maybe we did something wrong? Or can't you verify
that the extracted matrix already existed before the extraction?

Again, the options we gave are treated correctly, since we create the matrix
and give it to the nest. The "curious" thing is that PETSc retrieves (the
same?) matrix and changes its name...

Thanks again!

Eric
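For readers following along, here is a minimal sketch of the pattern Eric
describes: give the a_11 block its own options prefix and call
MatSetFromOptions before handing it to the MatNest, so that options such as
-gcrSchur_fieldsplit_a_11_mat_type sbaij are honoured at creation time.
Except for the prefix strings, everything below is a placeholder (the
function name, the 8x8 size, and the assumption that A00, A01 and A10 are
assembled elsewhere are illustrative, not code from the thread).

  #include <petscmat.h>

  static PetscErrorCode BuildNestWithPrefixedA11(Mat A00,Mat A01,Mat A10,Mat *A)
  {
    Mat            A11,blocks[4];
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatCreate(PETSC_COMM_WORLD,&A11);CHKERRQ(ierr);
    ierr = MatSetSizes(A11,PETSC_DECIDE,PETSC_DECIDE,8,8);CHKERRQ(ierr);
    /* options given with this prefix, e.g.
       -gcrSchur_fieldsplit_a_11_mat_type sbaij, are applied here, at creation time */
    ierr = MatSetOptionsPrefix(A11,"gcrSchur_fieldsplit_a_11_");CHKERRQ(ierr);
    ierr = MatSetFromOptions(A11);CHKERRQ(ierr);
    ierr = MatSetUp(A11);CHKERRQ(ierr);
    /* ... MatSetValues(A11,...) for the a_11 entries ... */
    ierr = MatAssemblyBegin(A11,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A11,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    blocks[0] = A00; blocks[1] = A01; blocks[2] = A10; blocks[3] = A11;
    ierr = MatCreateNest(PETSC_COMM_WORLD,2,NULL,2,NULL,blocks,A);CHKERRQ(ierr);
    /* When PCFieldSplit later extracts this (1,1) block, the fieldsplit.c
       lines quoted above replace its prefix with the split KSP's prefix,
       gcrSchur_fieldsplit_schur_, which is the renaming seen in the attached
       ksp_view log. */
    PetscFunctionReturn(0);
  }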
Eric From knepley at gmail.com Thu Sep 11 11:52:05 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Sep 2014 11:52:05 -0500 Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit In-Reply-To: <5411D20C.4010305@giref.ulaval.ca> References: <54042CDA.9080002@ntnu.no> <5411C0F9.6090106@giref.ulaval.ca> <5411D20C.4010305@giref.ulaval.ca> Message-ID: On Thu, Sep 11, 2014 at 11:47 AM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > On 09/11/2014 12:26 PM, Matthew Knepley wrote: > >> I am trying to understand the problem. This matrix is created by by the >> MatGetSubMatrix() call. >> > > If this is right, then I misunderstood the effect of MatGetSubMatrix on a > MatNest. We created a MatNest which holds the a_11 matrix... > MatNest is absolutely the worst thing in the PETSc interface. You should never ever ever ever be calling MatNest directly. You should be assembling into one matrix from views. Then MatNest can be used for optimization in the background. >> 1) What options are you trying to pass to it? >> > > mat_type sbaij > mat_block_size 3 (not in this example however) > These are reasonable, but they really apply at creation time (since you would not want to convert after values have been set), and it sounds like that is what you are doing. > >> 2) Why would they interfere with the KSP options? >> > > I didn't mentioned that they could interfere, just that the name was > changed and that it was "curious" from my point of view. > Okay, so the name change is strange, and happens because MatNest returns a reference to the inner matrix rather than some view which gets created and destroyed. Let me talk to Jed. Thanks, Matt > But if the sub-matrix extracted from the matnest is not the a_11 matrix we > gave to the matnest, then maybe we did something wrong? Or you can't > verify that the extracted matrix did exists before the extraction? > > Again, the options we gave are treated correctly, since we create the > matrix and give it to the nest. The "curious" thing is when petsc retreive > (the same?) matrix and changes its name... > > Thanks again! > > Eric > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Thu Sep 11 12:31:17 2014 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Thu, 11 Sep 2014 13:31:17 -0400 Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit In-Reply-To: References: <54042CDA.9080002@ntnu.no> <5411C0F9.6090106@giref.ulaval.ca> <5411D20C.4010305@giref.ulaval.ca> Message-ID: <5411DC65.4020505@giref.ulaval.ca> On 09/11/2014 12:52 PM, Matthew Knepley wrote: > On Thu, Sep 11, 2014 at 11:47 AM, Eric Chamberland > > MatNest is absolutely the worst thing in the PETSc interface. You should Sounds so strange to me... :-) ... because we decided to use MatNest to create sub-matrices for linear vs quadratic parts of a velocity field (http://onlinelibrary.wiley.com/doi/10.1002/nla.757/abstract). We choosed MatNest to minimize the memory used, and "overloaded" the MatSetValues calls to be able to do the assembly of u-u blocks in all 4 sub-matrices (doing global to local conversions with strides only). 
This prevent us to do 4 different loops over all the elements (in other words, it allows the assembly part of our code to not be aware that we are doing the assembly into a matnest)! We just have to do the numbering of the dofs correctly before everything. > never ever ever ever > be calling MatNest directly. You should be assembling into one matrix > from views. Then MatNest > can be used for optimization in the background. By "views" you mean doing MatGetLocalSubMatrix and MatRestoreLocalSubMatrix like in http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex28.c.html ? but... If I do a velocity(u)-pressure(p) field problem, how do I fix a block size of 3 for the u-u part and 1 for other parts without using a MatNest (with sub-matrices with good options)? (this will allows me to use a "fieldsplit_0_pc_type ml" with block size 3 on the u-u part for example) > These are reasonable, but they really apply at creation time (since you > would not want > to convert after values have been set), and it sounds like that is what > you are doing. yep. > Okay, so the name change is strange, and happens because MatNest returns > a reference > to the inner matrix rather than some view which gets created and > destroyed. Let me talk > to Jed. ok, thanks for your help!. Eric From evanum at gmail.com Thu Sep 11 15:24:36 2014 From: evanum at gmail.com (Evan Um) Date: Thu, 11 Sep 2014 13:24:36 -0700 Subject: [petsc-users] Using iterative refinement of MUMPS from PETSC Message-ID: Dear PETSC and MUMPS users, I try to use an iterative refinement option (ICNTL(10)=max # of iterative refinement) of MUMPS in my PETSC application. MUMPS manual says that if the solution is kept distributed (ICNTL(21)=1), the iterative refinement option is disabled. When a problem is solved using KSPSolve() with multiple cores, the solution is automatically kept distributed over the processors. Does this mean that iterative refinement of MUMPS is available from PETSC only when we run our application on a single core? I tried to solve my problem on multiple cores with ICNTL(10)=10 and INCTL(21)=0, but the program crashed. Does anyone know how to use iterative refinement option of MUMPS from PETSC on a parallel computer? In advance, thanks for your comments. Regards, Evan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 11 15:31:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Sep 2014 15:31:51 -0500 Subject: [petsc-users] Using iterative refinement of MUMPS from PETSC In-Reply-To: References: Message-ID: On Thu, Sep 11, 2014 at 3:24 PM, Evan Um wrote: > Dear PETSC and MUMPS users, > > I try to use an iterative refinement option (ICNTL(10)=max # of iterative > refinement) of MUMPS in my PETSC application. MUMPS manual says that if the > solution is kept distributed (ICNTL(21)=1), the iterative refinement option > is disabled. When a problem is solved using KSPSolve() with multiple cores, > the solution is automatically kept distributed over the processors. Does > this mean that iterative refinement of MUMPS is available from PETSC only > when we run our application on a single core? I tried to solve my problem > on multiple cores with ICNTL(10)=10 and INCTL(21)=0, but the program > crashed. Does anyone know how to use iterative refinement option of MUMPS > from PETSC on a parallel computer? In advance, thanks for your comments. > Just use Newton to solve your system, which for a linear operator is iterative refinement. 
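A concrete, command-line version of this idea (a sketch only: "./myapp" stands in for the user's application, and the option names are those of a PETSc 3.5-era build configured with MUMPS) is to let MUMPS do the factorization inside the preconditioner and let an outer Richardson iteration supply the refinement steps:

  mpiexec -n 4 ./myapp -ksp_type richardson -ksp_max_it 2 \
      -pc_type lu -pc_factor_mat_solver_package mumps

Each outer iteration then applies x <- x + A_f^{-1}(b - A x), i.e. the classical refinement update, and it works with the solution kept distributed across processes, so nothing has to be gathered onto one core.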
Matt > Regards, > Evan > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Sep 11 15:48:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 11 Sep 2014 15:48:01 -0500 Subject: [petsc-users] Using iterative refinement of MUMPS from PETSC In-Reply-To: References: Message-ID: <8BC05E73-B16E-42D4-8107-F4DD17159DAD@mcs.anl.gov> Evan, Just use PETSc to do the iterative refinement for you. For example -ksp_type gmres -pc_type lu -ksp_max_its 2 Barry On Sep 11, 2014, at 3:24 PM, Evan Um wrote: > Dear PETSC and MUMPS users, > > I try to use an iterative refinement option (ICNTL(10)=max # of iterative refinement) of MUMPS in my PETSC application. MUMPS manual says that if the solution is kept distributed (ICNTL(21)=1), the iterative refinement option is disabled. When a problem is solved using KSPSolve() with multiple cores, the solution is automatically kept distributed over the processors. Does this mean that iterative refinement of MUMPS is available from PETSC only when we run our application on a single core? I tried to solve my problem on multiple cores with ICNTL(10)=10 and INCTL(21)=0, but the program crashed. Does anyone know how to use iterative refinement option of MUMPS from PETSC on a parallel computer? In advance, thanks for your comments. > > Regards, > Evan > > > From jed at jedbrown.org Thu Sep 11 20:17:28 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 11 Sep 2014 19:17:28 -0600 Subject: [petsc-users] Curiosity about MatSetOptionsPrefix on a_11 in PCSetUp_FieldSplit In-Reply-To: <5411DC65.4020505@giref.ulaval.ca> References: <54042CDA.9080002@ntnu.no> <5411C0F9.6090106@giref.ulaval.ca> <5411D20C.4010305@giref.ulaval.ca> <5411DC65.4020505@giref.ulaval.ca> Message-ID: <87iokty6vr.fsf@jedbrown.org> Eric Chamberland writes: > On 09/11/2014 12:52 PM, Matthew Knepley wrote: >> On Thu, Sep 11, 2014 at 11:47 AM, Eric Chamberland >> >> MatNest is absolutely the worst thing in the PETSc interface. You should > > Sounds so strange to me... :-) MatNest is really supposed to be a backend optimization, not a public interface. Perhaps irresponsibly, we expose a few MatNest*() functions because those interfaces are convenient for a few legacy applications that made poor design/workflow decisions. There are a few other functions in PETSc with similar status, e.g., MatCreateMPIAIJWithSplitArrays(), but surely Matt would rate MatNest as the most popular and harmful such interface. > ... because we decided to use MatNest to create sub-matrices for linear > vs quadratic parts of a velocity field > (http://onlinelibrary.wiley.com/doi/10.1002/nla.757/abstract). We > choosed MatNest to minimize the memory used, and "overloaded" the > MatSetValues calls to be able to do the assembly of u-u blocks in all 4 > sub-matrices (doing global to local conversions with strides only). Better to work with local indices all along. The result is cleaner and faster. I understand that this option may not be available due to prior design choices. > This prevent us to do 4 different loops over all the elements (in other > words, it allows the assembly part of our code to not be aware that we > are doing the assembly into a matnest)! We just have to do the > numbering of the dofs correctly before everything. > >> never ever ever ever >> be calling MatNest directly. 
You should be assembling into one matrix >> from views. Then MatNest >> can be used for optimization in the background. > > By "views" you mean doing MatGetLocalSubMatrix and > MatRestoreLocalSubMatrix like in > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex28.c.html > ? > > but... If I do a velocity(u)-pressure(p) field problem, how do I fix a > block size of 3 for the u-u part and 1 for other parts without using a > MatNest (with sub-matrices with good options)? The isrow and iscol arguments should have a block size of 3. Then the Mat returned by MatGetLocalSubMatrix will inherit this block size such that MatSetValuesBlockedLocal behaves as desired. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mailinglists at xgm.de Fri Sep 12 02:32:15 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Fri, 12 Sep 2014 09:32:15 +0200 Subject: [petsc-users] Putting petsc in a namespace In-Reply-To: <8761gwby3j.fsf@jedbrown.org> References: <2589998.d4cIGtPG42@asaru> <8761gwby3j.fsf@jedbrown.org> Message-ID: <2161102.IeOYBfoxz2@asaru> Thanks to both of you for your input. Actually I'm not too sure what I want, and my wrapper will evolve... I'm not going for a complete wrapper, more a set of helper functions that are contained in an object. Rgds, ... Florian Am Dienstag, 9. September 2014, 09:41:20 schrieb Jed Brown: > Florian Lindner writes: > > > If a user of my class library does an #include "mypetsc.h" he should not get all the petsc symbols in his global namespace, e.g. > > > > #include "mypetsc.h" > > > > Vector v; // is an object of my class lib > > Vec pv; // should not work, since I do not want petsc symbols in my global namespace > > petsc::Vec pv; // fine, since petsc symbols are contained in namespace petsc. But this does not seem to be possible. > > > > petsc::Vec = v.vector // vector of type petsc::Vec is a public member of Vector. That's why "mypetsc.h" needs to #include "petscvec.h". Because of that an #include "mypetsc" imports all the symbols of "petscvec" into anyone wanting to use Vector. > > In your first email, you wrote: > > #include "petsc.h" > #include // if the user wants he can import parts of petsc > > This is trying to include the same file twice, which you cannot do. You > can wrap a namespace around a PETSc header, but it will break if the > user includes a PETSc file directly either before or after, and macros > like ADD_VALUES, NORM_2, PETSC_VIEWER_STDOUT_WORLD, etc., will not have > the petsc:: namespace. > > I think that your wrapper class with public members provides no value > and will be more difficult to use. At least this has been the case in > every instance I have seen, and almost everyone that tries later > concludes that it was a bad idea and abandons it. What value do you > think it provides? It's not encapsulation because the implementation is > public and it doesn't break any dependencies because the header is still > included. So it's mostly an extra layer of (logical) indirection > providing only a minor syntax change that will not be familiar to other > users of PETSc. What value is that? 
> > > Now if you mean to request that all PETSc symbols be unified in the > (C-style) Petsc* namespace instead of occupying a handful of others > (Vec*, Mat*, KSP*, etc.), we know that it's the right thing to do and > have been holding off mainly to limit the amount of changes that we > force users to deal with in order to upgrade. From mailinglists at xgm.de Fri Sep 12 02:56:35 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Fri, 12 Sep 2014 09:56:35 +0200 Subject: [petsc-users] Setting entries of symmetric matrix Message-ID: <1748903.GXlls7oxDY@asaru> Hello, I have a matrix that have the option set MAT_SYMMETRY_ETERNAL and set some values in the upper triangular. When reading values I was expecting that Petsc makes it a symmetric matrix, but the lower triangular is empty like it was initialized. Thanks, Florian Example code: #include #include "petscmat.h" #include "petscviewer.h" using namespace std; // Compiling with: mpic++ -g3 -Wall -I ~/software/petsc/include -I ~/software/petsc/arch-linux2-c-debug/include -L ~/software/petsc/arch-linux2-c-debug/lib -lpetsc test.cpp int main(int argc, char **args) { PetscInitialize(&argc, &args, "", NULL); PetscErrorCode ierr = 0; int N = 4; Mat matrix; // Create dense matrix, but code should work for sparse, too (I hope) // dense is more convenient to MatView. ierr = MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, N, NULL, &matrix); CHKERRQ(ierr); ierr = MatSetUp(matrix); CHKERRQ(ierr); ierr = MatSetOption(matrix, MAT_SYMMETRY_ETERNAL, PETSC_TRUE); CHKERRQ(ierr); MatSetValue(matrix, 1, 1, 1, INSERT_VALUES); MatSetValue(matrix, 1, 2, 2, INSERT_VALUES); MatSetValue(matrix, 1, 3, 3, INSERT_VALUES); MatSetValue(matrix, 2, 3, 4, INSERT_VALUES); ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); const PetscScalar *vals; ierr = MatGetRow(matrix, 2, NULL, NULL, &vals); cout << "Vals = " << vals[0] << " " << vals[1] << " " << vals[2] << " " << vals[3] << endl; // prints: Vals = 0 0 0 4 // excepted: Vals = 0 2 0 4 ierr = MatRestoreRow(matrix, 2, NULL, NULL, &vals); ierr = MatView(matrix, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); PetscFinalize(); return 0; } From knepley at gmail.com Fri Sep 12 07:49:37 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Sep 2014 07:49:37 -0500 Subject: [petsc-users] Setting entries of symmetric matrix In-Reply-To: <1748903.GXlls7oxDY@asaru> References: <1748903.GXlls7oxDY@asaru> Message-ID: On Fri, Sep 12, 2014 at 2:56 AM, Florian Lindner wrote: > Hello, > > I have a matrix that have the option set MAT_SYMMETRY_ETERNAL and set some > values in the upper triangular. When reading values I was expecting that > Petsc makes it a symmetric matrix, but the lower triangular is empty like > it was initialized. > This is not the point of the functions. The mechanics of symmetry are handled by the matrix type, SBAIJ. This option is an optimization hint. It says that the matrix will respond that it is symmetric if asked, which may be much much cheaper than checking whether it is actually symmetric. 
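To make the distinction concrete, here is a small sketch in the spirit of Florian's example (values and sizes made up; assumes a PETSc 3.5-or-later build): with the MATSBAIJ type only the upper triangle is stored and operations such as MatMult() apply the symmetry automatically, whereas MAT_SYMMETRY_ETERNAL on a dense or AIJ matrix only records the property and never mirrors the entries.

#include <petsc.h>

int main(int argc, char **args)
{
  Mat            A;
  Vec            x, y;
  PetscErrorCode ierr;
  PetscInt       N = 4;

  ierr = PetscInitialize(&argc, &args, NULL, NULL);CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetType(A, MATSBAIJ);CHKERRQ(ierr);   /* symmetric storage, block size 1 */
  ierr = MatSetUp(A);CHKERRQ(ierr);

  /* Only the upper triangle (row <= column) is inserted. */
  ierr = MatSetValue(A, 1, 1, 1.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValue(A, 1, 2, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValue(A, 1, 3, 3.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValue(A, 2, 3, 4.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* y = A*x uses the full symmetric operator even though only the upper
     triangle is stored; row 2 acts as (0 2 0 4) here. */
  ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr);
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);
  ierr = MatMult(A, x, y);CHKERRQ(ierr);
  ierr = VecView(y, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}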
Matt > Thanks, > Florian > > Example code: > > > #include > #include "petscmat.h" > #include "petscviewer.h" > > using namespace std; > > // Compiling with: mpic++ -g3 -Wall -I ~/software/petsc/include -I > ~/software/petsc/arch-linux2-c-debug/include -L > ~/software/petsc/arch-linux2-c-debug/lib -lpetsc test.cpp > > int main(int argc, char **args) > { > PetscInitialize(&argc, &args, "", NULL); > > PetscErrorCode ierr = 0; > int N = 4; > Mat matrix; > > // Create dense matrix, but code should work for sparse, too (I hope) > // dense is more convenient to MatView. > ierr = MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, > N, NULL, &matrix); CHKERRQ(ierr); > ierr = MatSetUp(matrix); CHKERRQ(ierr); > ierr = MatSetOption(matrix, MAT_SYMMETRY_ETERNAL, PETSC_TRUE); > CHKERRQ(ierr); > > MatSetValue(matrix, 1, 1, 1, INSERT_VALUES); > MatSetValue(matrix, 1, 2, 2, INSERT_VALUES); > MatSetValue(matrix, 1, 3, 3, INSERT_VALUES); > MatSetValue(matrix, 2, 3, 4, INSERT_VALUES); > > ierr = MatAssemblyBegin(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > ierr = MatAssemblyEnd(matrix, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr); > > const PetscScalar *vals; > ierr = MatGetRow(matrix, 2, NULL, NULL, &vals); > cout << "Vals = " << vals[0] << " " << vals[1] << " " << vals[2] << " " > << vals[3] << endl; > // prints: Vals = 0 0 0 4 > // excepted: Vals = 0 2 0 4 > ierr = MatRestoreRow(matrix, 2, NULL, NULL, &vals); > > ierr = MatView(matrix, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); > > PetscFinalize(); > return 0; > } > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed...
URL: From danyang.su at gmail.com Fri Sep 12 13:34:32 2014 From: danyang.su at gmail.com (Danyang Su) Date: Fri, 12 Sep 2014 11:34:32 -0700 Subject: [petsc-users] MPI_FILE_OPEN in PETSc crashes Message-ID: <54133CB8.9090703@gmail.com> Hi There, I have some parallel mpi output codes that works fine without PETSc but crashes when compiled with PETSc. To make the problem easy, I test the following example which has the same problem. This example is modified form http://www.mcs.anl.gov/research/projects/mpi/usingmpi2/examples/starting/io3f_f90.htm. It works without PETSc but if I comment out "use mpi" and add PETSc include, it crashes at MPI_FILE_OPEN because of access violation. Shall I rewrite all the MPI Parallel output with PetscBinaryOpen or PetscViewerBinaryOpen relative functions? Considering the parallel I/O efficiency, which is more preferable? Thanks and regards, Danyang PROGRAM main ! Fortran 90 users can (and should) use !use mpi ! instead of include 'mpif.h' if their MPI implementation provides a ! mpi module. !include 'mpif.h' !For PETSc, use the following "include"s #include #include #include integer ierr, i, myrank, BUFSIZE, thefile parameter (BUFSIZE=10) integer buf(BUFSIZE) integer(kind=MPI_OFFSET_KIND) disp call MPI_INIT(ierr) call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) do i = 1, BUFSIZE buf(i) = myrank * BUFSIZE + i enddo write(*,'(a,1x,i6,1x,a,1x,10(i6,1x))') "myrank", myrank, "buf",buf call MPI_FILE_OPEN(MPI_COMM_WORLD, 'testfile.txt', & MPI_MODE_CREATE + MPI_MODE_WRONLY, & MPI_INFO_NULL, thefile, ierr) ! assume 4-byte integers disp = myrank * BUFSIZE * 4 !Use the following two functions !call MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & ! MPI_INTEGER, 'native', & ! MPI_INFO_NULL, ierr) !call MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & ! MPI_STATUS_IGNORE, ierr) !Or use the following one function call MPI_FILE_WRITE_AT(thefile, disp, buf, BUFSIZE, MPI_INTEGER, & MPI_STATUS_IGNORE, ierr) call MPI_FILE_CLOSE(thefile, ierr) call MPI_FINALIZE(ierr) END PROGRAM main -------------- next part -------------- An HTML attachment was scrubbed... URL: From James.Balasalle at digitalglobe.com Fri Sep 12 13:39:16 2014 From: James.Balasalle at digitalglobe.com (James Balasalle) Date: Fri, 12 Sep 2014 18:39:16 +0000 Subject: [petsc-users] Valgrind Errors Message-ID: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> Hello, I'm getting some valgrind errors in my PETSc code that looks like it's related to MatTranspose(). I just figured I was doing something wrong. But I ran one of the examples (snes/ex70) which uses MatTranpose() through valgrind and see the same errors there as well. It seems that when the result of a MatTranspose is used as input to a MatMatMult() call valgrind is unhappy. Here's the valgrind output. I'm not concerned with the first MPI uninitialized error. But that invalid read of size 8 in mpiaij.c looks a bit concerning. I'm probably doing something wrong. Any ideas? Thanks, James bash-4.1$ valgrind ./ex70 ==21117== Memcheck, a memory error detector ==21117== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. 
==21117== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info ==21117== Command: ./ex70 ==21117== ==21117== Syscall param writev(vector[...]) points to uninitialised byte(s) ==21117== at 0x39898E0B2B: writev (in /lib64/libc-2.12.so) ==21117== by 0x8996F16: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:249) ==21117== by 0x8997F3C: mca_oob_tcp_peer_send (oob_tcp_peer.c:204) ==21117== by 0x899A2DC: mca_oob_tcp_send_nb (oob_tcp_send.c:167) ==21117== by 0x8388955: orte_rml_oob_send (rml_oob_send.c:136) ==21117== by 0x8388B9F: orte_rml_oob_send_buffer (rml_oob_send.c:270) ==21117== by 0x8DA4F97: modex (grpcomm_bad_module.c:573) ==21117== by 0x6E31E6A: ompi_mpi_init (ompi_mpi_init.c:541) ==21117== by 0x6E4860F: PMPI_Init_thread (pinit_thread.c:84) ==21117== by 0x4DAA379: PetscInitialize (pinit.c:781) ==21117== by 0x409E29: main (ex70.c:668) ==21117== Address 0x9c7e261 is 161 bytes inside a block of size 256 alloc'd ==21117== at 0x4A06C9C: realloc (vg_replace_malloc.c:687) ==21117== by 0x6EB7FF2: opal_dss_buffer_extend (dss_internal_functions.c:63) ==21117== by 0x6EB81B4: opal_dss_copy_payload (dss_load_unload.c:164) ==21117== by 0x6E90C36: orte_grpcomm_base_pack_modex_entries (grpcomm_base_modex.c:861) ==21117== by 0x8DA4F4C: modex (grpcomm_bad_module.c:563) ==21117== by 0x6E31E6A: ompi_mpi_init (ompi_mpi_init.c:541) ==21117== by 0x6E4860F: PMPI_Init_thread (pinit_thread.c:84) ==21117== by 0x4DAA379: PetscInitialize (pinit.c:781) ==21117== by 0x409E29: main (ex70.c:668) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E504: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5220) ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd420 is 0 bytes after a block of size 48 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A8A: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E516: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5221) ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd3d0 is not stack'd, malloc'd or (recently) free'd ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E64D: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5226) ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) 
==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd3b0 is 0 bytes after a block of size 16 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E66B: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5228) ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd3b8 is 8 bytes after a block of size 16 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E504: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5220) ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd420 is 0 bytes after a block of size 48 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A8A: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E516: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5221) ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) 
==21117== Address 0x9cdd3d0 is not stack'd, malloc'd or (recently) free'd ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E64D: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5226) ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd3b0 is 0 bytes after a block of size 16 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== ==21117== Invalid read of size 8 ==21117== at 0x553E66B: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5228) ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) ==21117== by 0x5601808: MatMatMult (matrix.c:8714) ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== Address 0x9cdd3b8 is 8 bytes after a block of size 16 alloc'd ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) ==21117== by 0x40A0D3: main (ex70.c:679) ==21117== residual u = 3.56267e-06 residual p = 1.14951e-05 residual [u,p] = 1.20346e-05 discretization error u = 0.0106477 discretization error p = 1.85783 discretization error [u,p] = 1.85786 ==21117== ==21117== HEAP SUMMARY: ==21117== in use at exit: 345,301 bytes in 3,773 blocks ==21117== total heap usage: 24,730 allocs, 20,957 frees, 16,608,714 bytes allocated ==21117== ==21117== LEAK SUMMARY: ==21117== definitely lost: 42,743 bytes in 40 blocks ==21117== indirectly lost: 11,134 bytes in 28 blocks ==21117== possibly lost: 0 bytes in 0 blocks ==21117== still reachable: 291,424 bytes in 3,705 blocks ==21117== suppressed: 0 bytes in 0 blocks ==21117== Rerun with --leak-check=full to see details of leaked memory ==21117== ==21117== For counts of detected and suppressed errors, rerun with: -v ==21117== Use --track-origins=yes to see where uninitialised values come from ==21117== ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 6 from 6) This electronic communication and any attachments may contain confidential and proprietary information of DigitalGlobe, Inc. 
If you are not the intended recipient, or an agent or employee responsible for delivering this communication to the intended recipient, or if you have received this communication in error, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this communication in error, and delete the copy you received. DigitalGlobe reserves the right to monitor any electronic communication sent or received by its employees, agents or representatives. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Sep 12 14:10:06 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 Sep 2014 14:10:06 -0500 Subject: [petsc-users] MPI_FILE_OPEN in PETSc crashes In-Reply-To: <54133CB8.9090703@gmail.com> References: <54133CB8.9090703@gmail.com> Message-ID: On Sep 12, 2014, at 1:34 PM, Danyang Su wrote: > Hi There, > > I have some parallel mpi output codes that works fine without PETSc but crashes when compiled with PETSc. To make the problem easy, I test the following example which has the same problem. This example is modified form http://www.mcs.anl.gov/research/projects/mpi/usingmpi2/examples/starting/io3f_f90.htm. It works without PETSc but if I comment out "use mpi" and add PETSc include, it crashes at MPI_FILE_OPEN because of access violation. You should not comment out use mpi; you need that to use the MPI calls you are making! You absolutely should have an implicit none at the beginning of your program so the Fortran compiler reports undeclared variables > > Shall I rewrite all the MPI Parallel output with PetscBinaryOpen or PetscViewerBinaryOpen relative functions? No. You should be able to use the MPI IO with PETSc code. First figure out how to get the example working with the use mpi PLUS petsc include files. Let us know if you have any problems ASAP Barry > Considering the parallel I/O efficiency, which is more preferable? > > Thanks and regards, > > Danyang > > PROGRAM main > ! Fortran 90 users can (and should) use > !use mpi > ! instead of include 'mpif.h' if their MPI implementation provides a > ! mpi module. > !include 'mpif.h' > > !For PETSc, use the following "include"s > #include > #include > #include > integer ierr, i, myrank, BUFSIZE, thefile > parameter (BUFSIZE=10) > integer buf(BUFSIZE) > integer(kind=MPI_OFFSET_KIND) disp > > call MPI_INIT(ierr) > call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) > > do i = 1, BUFSIZE > buf(i) = myrank * BUFSIZE + i > enddo > > write(*,'(a,1x,i6,1x,a,1x,10(i6,1x))') "myrank", myrank, "buf",buf > > call MPI_FILE_OPEN(MPI_COMM_WORLD, 'testfile.txt', & > MPI_MODE_CREATE + MPI_MODE_WRONLY, & > MPI_INFO_NULL, thefile, ierr) > ! assume 4-byte integers > disp = myrank * BUFSIZE * 4 > > !Use the following two functions > !call MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & > ! MPI_INTEGER, 'native', & > ! MPI_INFO_NULL, ierr) > !call MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & > ! 
MPI_STATUS_IGNORE, ierr) > > !Or use the following one function > call MPI_FILE_WRITE_AT(thefile, disp, buf, BUFSIZE, MPI_INTEGER, & > MPI_STATUS_IGNORE, ierr) > > call MPI_FILE_CLOSE(thefile, ierr) > call MPI_FINALIZE(ierr) > > END PROGRAM main From bsmith at mcs.anl.gov Fri Sep 12 15:11:48 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 Sep 2014 15:11:48 -0500 Subject: [petsc-users] Valgrind Errors In-Reply-To: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> Message-ID: <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> James (and Hong), Do you ever see this problem in parallel runs? You are not doing anything wrong. Here is what is happening. MatGetBrowsOfAoCols_MPIAIJ() which is used by MatMatMult_MPIAIJ_MPIAIJ() assumes that the VecScatters for the matrix-vector products are gen_to = (VecScatter_MPI_General*)ctx->todata; gen_from = (VecScatter_MPI_General*)ctx->from data; but when run on one process the scatters are not of that form; hence the code accesses values in what it thinks is one struct but is actually a different one. Hence the valgrind errors. But since the matrix only lives on one process there is actually nothing to move between processors hence no error happens in the computation. You can avoid the issue completely by using MATAIJ matrix for the type instead of MATMPIAIJ and then on one process it automatically uses MATSEQAIJ. I don?t think the bug has anything in particular to do with the MatTranspose. Hong, Can you please fix this code? Essentially you can by pass parts of the code when the Mat is on only one process. (Maybe this also happens for MPIBAIJ matrices?) Send a response letting me know you saw this. Thanks Barry On Sep 12, 2014, at 1:39 PM, James Balasalle wrote: > Hello, > > I?m getting some valgrind errors in my PETSc code that looks like it?s related to MatTranspose(). I just figured I was doing something wrong. But I ran one of the examples (snes/ex70) which uses MatTranpose() through valgrind and see the same errors there as well. It seems that when the result of a MatTranspose is used as input to a MatMatMult() call valgrind is unhappy. > > Here?s the valgrind output. I?m not concerned with the first MPI uninitialized error. But that invalid read of size 8 in mpiaij.c looks a bit concerning. > > I?m probably doing something wrong. Any ideas? > > Thanks, > > James > > > bash-4.1$ valgrind ./ex70 > ==21117== Memcheck, a memory error detector > ==21117== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al. 
> ==21117== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info > ==21117== Command: ./ex70 > ==21117== > ==21117== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==21117== at 0x39898E0B2B: writev (in /lib64/libc-2.12.so) > ==21117== by 0x8996F16: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:249) > ==21117== by 0x8997F3C: mca_oob_tcp_peer_send (oob_tcp_peer.c:204) > ==21117== by 0x899A2DC: mca_oob_tcp_send_nb (oob_tcp_send.c:167) > ==21117== by 0x8388955: orte_rml_oob_send (rml_oob_send.c:136) > ==21117== by 0x8388B9F: orte_rml_oob_send_buffer (rml_oob_send.c:270) > ==21117== by 0x8DA4F97: modex (grpcomm_bad_module.c:573) > ==21117== by 0x6E31E6A: ompi_mpi_init (ompi_mpi_init.c:541) > ==21117== by 0x6E4860F: PMPI_Init_thread (pinit_thread.c:84) > ==21117== by 0x4DAA379: PetscInitialize (pinit.c:781) > ==21117== by 0x409E29: main (ex70.c:668) > ==21117== Address 0x9c7e261 is 161 bytes inside a block of size 256 alloc'd > ==21117== at 0x4A06C9C: realloc (vg_replace_malloc.c:687) > ==21117== by 0x6EB7FF2: opal_dss_buffer_extend (dss_internal_functions.c:63) > ==21117== by 0x6EB81B4: opal_dss_copy_payload (dss_load_unload.c:164) > ==21117== by 0x6E90C36: orte_grpcomm_base_pack_modex_entries (grpcomm_base_modex.c:861) > ==21117== by 0x8DA4F4C: modex (grpcomm_bad_module.c:563) > ==21117== by 0x6E31E6A: ompi_mpi_init (ompi_mpi_init.c:541) > ==21117== by 0x6E4860F: PMPI_Init_thread (pinit_thread.c:84) > ==21117== by 0x4DAA379: PetscInitialize (pinit.c:781) > ==21117== by 0x409E29: main (ex70.c:668) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E504: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5220) > ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) > ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd420 is 0 bytes after a block of size 48 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A8A: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E516: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5221) > ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) > ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3d0 is not stack'd, malloc'd or (recently) free'd > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E64D: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5226) > ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) > ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ 
(mpimatmatmult.c:33) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3b0 is 0 bytes after a block of size 16 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E66B: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5228) > ==21117== by 0x557D53D: MatMatMultSymbolic_MPIAIJ_MPIAIJ (mpimatmatmult.c:677) > ==21117== by 0x55758FC: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:33) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3b8 is 8 bytes after a block of size 16 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E504: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5220) > ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) > ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd420 is 0 bytes after a block of size 48 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A8A: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E516: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5221) > ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) > ==21117== by 
0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3d0 is not stack'd, malloc'd or (recently) free'd > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E64D: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5226) > ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) > ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3b0 is 0 bytes after a block of size 16 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > ==21117== Invalid read of size 8 > ==21117== at 0x553E66B: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:5228) > ==21117== by 0x557C680: MatMatMultNumeric_MPIAIJ_MPIAIJ_Scalable (mpimatmatmult.c:560) > ==21117== by 0x5575BBE: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:39) > ==21117== by 0x5601808: MatMatMult (matrix.c:8714) > ==21117== by 0x4067D0: StokesSetupApproxSchur (ex70.c:379) > ==21117== by 0x406DB5: StokesSetupMatrix (ex70.c:399) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== Address 0x9cdd3b8 is 8 bytes after a block of size 16 alloc'd > ==21117== at 0x4A055DC: memalign (vg_replace_malloc.c:755) > ==21117== by 0x4D42117: PetscMallocAlign (mal.c:27) > ==21117== by 0x5016A58: VecScatterCreate (vscat.c:1168) > ==21117== by 0x5547B10: MatSetUpMultiply_MPIAIJ (mmaij.c:116) > ==21117== by 0x5509F30: MatAssemblyEnd_MPIAIJ (mpiaij.c:702) > ==21117== by 0x55D978A: MatAssemblyEnd (matrix.c:4901) > ==21117== by 0x551D7AD: MatTranspose_MPIAIJ (mpiaij.c:2024) > ==21117== by 0x55D394A: MatTranspose (matrix.c:4382) > ==21117== by 0x405CE4: StokesSetupMatBlock10 (ex70.c:337) > ==21117== by 0x406C60: StokesSetupMatrix (ex70.c:396) > ==21117== by 0x40A0D3: main (ex70.c:679) > ==21117== > residual u = 3.56267e-06 > residual p = 1.14951e-05 > residual [u,p] = 1.20346e-05 > discretization error u = 0.0106477 > discretization error p = 1.85783 > discretization error [u,p] = 1.85786 > ==21117== > ==21117== HEAP SUMMARY: > ==21117== in use at exit: 345,301 bytes in 3,773 blocks > ==21117== total heap usage: 24,730 allocs, 20,957 frees, 16,608,714 bytes allocated > ==21117== > ==21117== LEAK SUMMARY: > ==21117== definitely lost: 42,743 bytes in 40 blocks > ==21117== indirectly lost: 11,134 bytes in 28 blocks > ==21117== possibly lost: 0 bytes in 0 blocks > ==21117== still reachable: 291,424 bytes in 3,705 blocks > ==21117== suppressed: 0 bytes in 0 blocks > ==21117== Rerun with --leak-check=full to see details of leaked memory > ==21117== > ==21117== For counts of detected and 
suppressed errors, rerun with: -v > ==21117== Use --track-origins=yes to see where uninitialised values come from > ==21117== ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 6 from 6) > > > This electronic communication and any attachments may contain confidential and proprietary > information of DigitalGlobe, Inc. If you are not the intended recipient, or an agent or employee > responsible for delivering this communication to the intended recipient, or if you have received > this communication in error, please do not print, copy, retransmit, disseminate or > otherwise use the information. Please indicate to the sender that you have received this > communication in error, and delete the copy you received. DigitalGlobe reserves the > right to monitor any electronic communication sent or received by its employees, agents > or representatives. > From dmeiser at txcorp.com Fri Sep 12 15:40:09 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Fri, 12 Sep 2014 14:40:09 -0600 Subject: [petsc-users] Valgrind Errors In-Reply-To: <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> Message-ID: <54135A29.1060601@txcorp.com> On 09/12/2014 02:11 PM, Barry Smith wrote: > James (and Hong), > > Do you ever see this problem in parallel runs? > > You are not doing anything wrong. > > Here is what is happening. > > MatGetBrowsOfAoCols_MPIAIJ() which is used by MatMatMult_MPIAIJ_MPIAIJ() assumes that the VecScatters for the matrix-vector products are > > gen_to = (VecScatter_MPI_General*)ctx->todata; > gen_from = (VecScatter_MPI_General*)ctx->from data; > > but when run on one process the scatters are not of that form; hence the code accesses values in what it thinks is one struct but is actually a different one. Hence the valgrind errors. > > But since the matrix only lives on one process there is actually nothing to move between processors hence no error happens in the computation. You can avoid the issue completely by using MATAIJ matrix for the type instead of MATMPIAIJ and then on one process it automatically uses MATSEQAIJ. > > I don?t think the bug has anything in particular to do with the MatTranspose. > > Hong, > > Can you please fix this code? Essentially you can by pass parts of the code when the Mat is on only one process. (Maybe this also happens for MPIBAIJ matrices?) Send a response letting me know you saw this. > > Thanks > > Barry I had to fix a few issues similar to this a while back. The method VecScatterGetTypes_Private introduced in pull request 176 might be useful in this context. Cheers, Dominic From danyang.su at gmail.com Fri Sep 12 16:09:12 2014 From: danyang.su at gmail.com (Danyang Su) Date: Fri, 12 Sep 2014 14:09:12 -0700 Subject: [petsc-users] MPI_FILE_OPEN in PETSc crashes In-Reply-To: References: <54133CB8.9090703@gmail.com> Message-ID: <541360F8.3090305@gmail.com> On 12/09/2014 12:10 PM, Barry Smith wrote: > On Sep 12, 2014, at 1:34 PM, Danyang Su wrote: > >> Hi There, >> >> I have some parallel mpi output codes that works fine without PETSc but crashes when compiled with PETSc. To make the problem easy, I test the following example which has the same problem. This example is modified form http://www.mcs.anl.gov/research/projects/mpi/usingmpi2/examples/starting/io3f_f90.htm. It works without PETSc but if I comment out "use mpi" and add PETSc include, it crashes at MPI_FILE_OPEN because of access violation. 
> You should not comment out use mpi; you need that to use the MPI calls you are making! > > You absolutely should have an implicit none at the beginning of your program so the Fortran compiler reports undeclared variables Done > >> Shall I rewrite all the MPI Parallel output with PetscBinaryOpen or PetscViewerBinaryOpen relative functions? > No. You should be able to use the MPI IO with PETSc code. > > First figure out how to get the example working with the use mpi PLUS petsc include files. The problem is there is a lot of name conflict if both "use mpi" and "petsc include" are included. I will consider rewrite these routines. > > Let us know if you have any problems ASAP > > Barry > >> Considering the parallel I/O efficiency, which is more preferable? >> >> Thanks and regards, >> >> Danyang >> >> PROGRAM main >> ! Fortran 90 users can (and should) use >> !use mpi >> ! instead of include 'mpif.h' if their MPI implementation provides a >> ! mpi module. >> !include 'mpif.h' >> >> !For PETSc, use the following "include"s >> #include >> #include >> #include >> integer ierr, i, myrank, BUFSIZE, thefile >> parameter (BUFSIZE=10) >> integer buf(BUFSIZE) >> integer(kind=MPI_OFFSET_KIND) disp >> >> call MPI_INIT(ierr) >> call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) >> >> do i = 1, BUFSIZE >> buf(i) = myrank * BUFSIZE + i >> enddo >> >> write(*,'(a,1x,i6,1x,a,1x,10(i6,1x))') "myrank", myrank, "buf",buf >> >> call MPI_FILE_OPEN(MPI_COMM_WORLD, 'testfile.txt', & >> MPI_MODE_CREATE + MPI_MODE_WRONLY, & >> MPI_INFO_NULL, thefile, ierr) >> ! assume 4-byte integers >> disp = myrank * BUFSIZE * 4 >> >> !Use the following two functions >> !call MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & >> ! MPI_INTEGER, 'native', & >> ! MPI_INFO_NULL, ierr) >> !call MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & >> ! MPI_STATUS_IGNORE, ierr) >> >> !Or use the following one function >> call MPI_FILE_WRITE_AT(thefile, disp, buf, BUFSIZE, MPI_INTEGER, & >> MPI_STATUS_IGNORE, ierr) >> >> call MPI_FILE_CLOSE(thefile, ierr) >> call MPI_FINALIZE(ierr) >> >> END PROGRAM main From bsmith at mcs.anl.gov Fri Sep 12 16:15:44 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 12 Sep 2014 16:15:44 -0500 Subject: [petsc-users] MPI_FILE_OPEN in PETSc crashes In-Reply-To: <541360F8.3090305@gmail.com> References: <54133CB8.9090703@gmail.com> <541360F8.3090305@gmail.com> Message-ID: <2129733C-F97F-4A05-8C62-9C1AF4E369D0@mcs.anl.gov> Maybe put the code that does the MPI IO in a separate file that gets called and does not need to also have all the PETSc includes Barry On Sep 12, 2014, at 4:09 PM, Danyang Su wrote: > On 12/09/2014 12:10 PM, Barry Smith wrote: >> On Sep 12, 2014, at 1:34 PM, Danyang Su wrote: >> >>> Hi There, >>> >>> I have some parallel mpi output codes that works fine without PETSc but crashes when compiled with PETSc. To make the problem easy, I test the following example which has the same problem. This example is modified form http://www.mcs.anl.gov/research/projects/mpi/usingmpi2/examples/starting/io3f_f90.htm. It works without PETSc but if I comment out "use mpi" and add PETSc include, it crashes at MPI_FILE_OPEN because of access violation. >> You should not comment out use mpi; you need that to use the MPI calls you are making! >> >> You absolutely should have an implicit none at the beginning of your program so the Fortran compiler reports undeclared variables > Done >> >>> Shall I rewrite all the MPI Parallel output with PetscBinaryOpen or PetscViewerBinaryOpen relative functions? >> No. 
You should be able to use the MPI IO with PETSc code. >> >> First figure out how to get the example working with the use mpi PLUS petsc include files. > The problem is there is a lot of name conflict if both "use mpi" and "petsc include" are included. I will consider rewrite these routines. >> >> Let us know if you have any problems ASAP >> >> Barry >> >>> Considering the parallel I/O efficiency, which is more preferable? >>> >>> Thanks and regards, >>> >>> Danyang >>> >>> PROGRAM main >>> ! Fortran 90 users can (and should) use >>> !use mpi >>> ! instead of include 'mpif.h' if their MPI implementation provides a >>> ! mpi module. >>> !include 'mpif.h' >>> !For PETSc, use the following "include"s >>> #include >>> #include >>> #include >>> integer ierr, i, myrank, BUFSIZE, thefile >>> parameter (BUFSIZE=10) >>> integer buf(BUFSIZE) >>> integer(kind=MPI_OFFSET_KIND) disp >>> call MPI_INIT(ierr) >>> call MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr) >>> do i = 1, BUFSIZE >>> buf(i) = myrank * BUFSIZE + i >>> enddo >>> write(*,'(a,1x,i6,1x,a,1x,10(i6,1x))') "myrank", myrank, "buf",buf >>> call MPI_FILE_OPEN(MPI_COMM_WORLD, 'testfile.txt', & >>> MPI_MODE_CREATE + MPI_MODE_WRONLY, & >>> MPI_INFO_NULL, thefile, ierr) >>> ! assume 4-byte integers >>> disp = myrank * BUFSIZE * 4 >>> !Use the following two functions >>> !call MPI_FILE_SET_VIEW(thefile, disp, MPI_INTEGER, & >>> ! MPI_INTEGER, 'native', & >>> ! MPI_INFO_NULL, ierr) >>> !call MPI_FILE_WRITE(thefile, buf, BUFSIZE, MPI_INTEGER, & >>> ! MPI_STATUS_IGNORE, ierr) >>> !Or use the following one function >>> call MPI_FILE_WRITE_AT(thefile, disp, buf, BUFSIZE, MPI_INTEGER, & >>> MPI_STATUS_IGNORE, ierr) >>> call MPI_FILE_CLOSE(thefile, ierr) >>> call MPI_FINALIZE(ierr) >>> END PROGRAM main From hzhang at mcs.anl.gov Fri Sep 12 17:28:31 2014 From: hzhang at mcs.anl.gov (Hong) Date: Fri, 12 Sep 2014 17:28:31 -0500 Subject: [petsc-users] Valgrind Errors In-Reply-To: <54135A29.1060601@txcorp.com> References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> <54135A29.1060601@txcorp.com> Message-ID: I'll check it. Hong On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser wrote: > On 09/12/2014 02:11 PM, Barry Smith wrote: >> >> James (and Hong), >> >> Do you ever see this problem in parallel runs? >> >> You are not doing anything wrong. >> >> Here is what is happening. >> >> MatGetBrowsOfAoCols_MPIAIJ() which is used by MatMatMult_MPIAIJ_MPIAIJ() >> assumes that the VecScatters for the matrix-vector products are >> >> gen_to = (VecScatter_MPI_General*)ctx->todata; >> gen_from = (VecScatter_MPI_General*)ctx->from data; >> >> but when run on one process the scatters are not of that form; hence the >> code accesses values in what it thinks is one struct but is actually a >> different one. Hence the valgrind errors. >> >> But since the matrix only lives on one process there is actually nothing >> to move between processors hence no error happens in the computation. You >> can avoid the issue completely by using MATAIJ matrix for the type instead >> of MATMPIAIJ and then on one process it automatically uses MATSEQAIJ. >> >> I don?t think the bug has anything in particular to do with the >> MatTranspose. >> >> Hong, >> >> Can you please fix this code? Essentially you can by pass parts of >> the code when the Mat is on only one process. (Maybe this also happens for >> MPIBAIJ matrices?) Send a response letting me know you saw this. 
>> >> Thanks >> >> Barry > > I had to fix a few issues similar to this a while back. The method > VecScatterGetTypes_Private introduced in pull request 176 might be useful in > this context. > > Cheers, > Dominic >
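For readers skimming this thread: the workaround Barry describes above (use the generic MATAIJ type, so that a one-process run automatically becomes MATSEQAIJ and the MPI-specific scatter code path is never touched) amounts to a few lines of setup. The sketch below is illustrative only and is not taken from ex70; the size n and the option-handling calls are hypothetical, and error checking is reduced to CHKERRQ.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       n = 100;                 /* hypothetical global size */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  /* MATAIJ rather than MATMPIAIJ: PETSc then selects MATSEQAIJ on one
     process and MATMPIAIJ on several, which avoids the one-process
     scatter accesses reported by valgrind in the thread above. */
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  /* ... MatSetValues()/MatAssemblyBegin()/MatAssemblyEnd(), then
     MatTranspose() and MatMatMult() as in the Stokes example ... */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}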
From szczerba.dominik at gmail.com Mon Sep 15 09:01:46 2014 From: szczerba.dominik at gmail.com (Dominik Szczerba) Date: Mon, 15 Sep 2014 16:01:46 +0200 Subject: [petsc-users] ERROR: Cannot mix add values and insert values! Message-ID: I am indeed mixing ADD with INSERT. But when linking my code with release version of Petsc I do not get this error, only when I link with the debug version. I wanted to make sure it is indeed wrong to do so, and if so, why the error does not come up also in the release version. Many thanks! [0]PETSC ERROR: Object is in wrong state! [0]PETSC ERROR: Cannot mix add values and insert values! From knepley at gmail.com Mon Sep 15 09:05:18 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Sep 2014 09:05:18 -0500 Subject: [petsc-users] ERROR: Cannot mix add values and insert values! In-Reply-To: References: Message-ID: On Mon, Sep 15, 2014 at 9:01 AM, Dominik Szczerba < szczerba.dominik at gmail.com> wrote: > I am indeed mixing ADD with INSERT. But when linking my code with > release version of Petsc I do not get this error, only when I link > with the debug version. > Yes, we only guarantee error checking in debug mode. > I wanted to make sure it is indeed wrong to do so, and if so, why the > error does not come up also in the release version. > Yes, it is wrong. You must call Assemble in between (although calling it with ASSEMBLY_FLUSH is fine). Thanks, Matt > Many thanks! > > [0]PETSC ERROR: Object is in wrong state! > [0]PETSC ERROR: Cannot mix add values and insert values! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gbisht at lbl.gov Mon Sep 15 10:44:10 2014 From: gbisht at lbl.gov (Gautam Bisht) Date: Mon, 15 Sep 2014 08:44:10 -0700 Subject: [petsc-users] Identifier for DMs within DMComposite Message-ID: Hi, In my application, the number and order of DMs within a DMComposite could vary. Is there a way to tag (perhaps using an integer value) individual DMs within a DMComposite? Thanks, Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From katyghantous at gmail.com Mon Sep 15 12:45:28 2014 From: katyghantous at gmail.com (Katy Ghantous) Date: Mon, 15 Sep 2014 19:45:28 +0200 Subject: [petsc-users] speedup for TS solver using DMDA Message-ID: Hi, I am using DMDA to run in parallel TS to solve a set of N equations. I am using DMDAGetCorners in the RHSfunction with setting the stencil size at 2 to solve a set of coupled ODEs on 30 cores. The machine has 32 cores (2 physical CPUs with 2x8 core each with speed of 3.4Ghz per core). However, mpiexec with more than one core is showing no speedup. Also at the configuring/testing stage for petsc on that machine, there was no speedup and it only reported one node. Is there something wrong with how I configured petsc or is the approach inappropriate for the machine? I am not sure what files (or sections of the code) you would need to be able to answer my question. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 15 13:13:21 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Sep 2014 13:13:21 -0500 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: References: Message-ID: On Mon, Sep 15, 2014 at 12:45 PM, Katy Ghantous wrote: > Hi, > I am using DMDA to run in parallel TS to solves a set of N equations. I am > using DMDAGetCorners in the RHSfunction with setting the stencil size at 2 > to solve a set of coupled ODEs on 30 cores. > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed of > 3.4Ghz per core). > However, mpiexec with more than one core is showing no speedup. > Also at the configuring/testing stage for petsc on that machine, there was > no speedup and it only reported one node. > Is there somehting wrong with how i configured petsc or is the approach > inappropriate for the machine? > I am not sure what files (or sections of the code) you would need to be > able to answer my question. > The kind of code you describe sounds memory bandwidth limited. More information is here: http://www.mcs.anl.gov/petsc/documentation/faq.html#computers The STREAMS should give you an idea of the bandwidth, and running it on 2 procs vs 1 should give you an idea of the speedup to expect, no matter how many cores you use. Thanks, Matt > Thank you! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 15 13:23:58 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 15 Sep 2014 13:23:58 -0500 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: References: Message-ID: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> Please send the output from running make steams NPMAX=32 in the PETSc root directory. Barry My guess is that it reports ?one node? is just because it uses the ?hostname? to distinguish nodes and though your machine has two CPUs, from the point of view of the OS it has only a single hostname and hence reports just one ?node?. On Sep 15, 2014, at 12:45 PM, Katy Ghantous wrote: > Hi, > I am using DMDA to run in parallel TS to solves a set of N equations. I am using DMDAGetCorners in the RHSfunction with setting the stencil size at 2 to solve a set of coupled ODEs on 30 cores. > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed of 3.4Ghz per core). > However, mpiexec with more than one core is showing no speedup. > Also at the configuring/testing stage for petsc on that machine, there was no speedup and it only reported one node. > Is there somehting wrong with how i configured petsc or is the approach inappropriate for the machine? > I am not sure what files (or sections of the code) you would need to be able to answer my question. > > Thank you! From katyghantous at gmail.com Mon Sep 15 13:42:12 2014 From: katyghantous at gmail.com (Katy Ghantous) Date: Mon, 15 Sep 2014 20:42:12 +0200 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> References: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> Message-ID: Matt, thanks! i will look into that and find other ways to make the computation faster. Barry, the benchmark reports up to 2 speedup, but says 1 node in the end. but either way i was expecting a higher speedup.. 
2 is the limit for two cpus despite the multiple cores? please let me know if the file attached is what you are asking for. Thank you! On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith wrote: > > Please send the output from running > > make steams NPMAX=32 > > in the PETSc root directory. > > > Barry > > My guess is that it reports ?one node? is just because it uses the > ?hostname? to distinguish nodes and though your machine has two CPUs, from > the point of view of the OS it has only a single hostname and hence reports > just one ?node?. > > > On Sep 15, 2014, at 12:45 PM, Katy Ghantous > wrote: > > > Hi, > > I am using DMDA to run in parallel TS to solves a set of N equations. I > am using DMDAGetCorners in the RHSfunction with setting the stencil size at > 2 to solve a set of coupled ODEs on 30 cores. > > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed > of 3.4Ghz per core). > > However, mpiexec with more than one core is showing no speedup. > > Also at the configuring/testing stage for petsc on that machine, there > was no speedup and it only reported one node. > > Is there somehting wrong with how i configured petsc or is the approach > inappropriate for the machine? > > I am not sure what files (or sections of the code) you would need to be > able to answer my question. > > > > Thank you! > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scaling.log Type: text/x-log Size: 14779 bytes Desc: not available URL: From knepley at gmail.com Mon Sep 15 13:47:27 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 15 Sep 2014 13:47:27 -0500 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: References: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> Message-ID: On Mon, Sep 15, 2014 at 1:42 PM, Katy Ghantous wrote: > Matt, thanks! i will look into that and find other ways to make the > computation faster. > > Barry, the benchmark reports up to 2 speedup, but says 1 node in the end. > but either way i was expecting a higher speedup.. 2 is the limit for two > cpus despite the multiple cores? > This is a bit of a scam on the part of chip makers. The speed of your computations, say VecAXPY and simple stencil operations, is not determined by the flop rate, but by the memory bandwidth. They sell you a computer with a great flop rate, but not much bandwidth at all. This is much like a car dealer who sells you a car with an incredible amount of torque, just loads of torque, enough torque to tear down a building, but that is not going to make you go faster. Matt > please let me know if the file attached is what you are asking for. > Thank you! > > > On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith wrote: > >> >> Please send the output from running >> >> make steams NPMAX=32 >> >> in the PETSc root directory. >> >> >> Barry >> >> My guess is that it reports ?one node? is just because it uses the >> ?hostname? to distinguish nodes and though your machine has two CPUs, from >> the point of view of the OS it has only a single hostname and hence reports >> just one ?node?. >> >> >> On Sep 15, 2014, at 12:45 PM, Katy Ghantous >> wrote: >> >> > Hi, >> > I am using DMDA to run in parallel TS to solves a set of N equations. I >> am using DMDAGetCorners in the RHSfunction with setting the stencil size at >> 2 to solve a set of coupled ODEs on 30 cores. >> > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed >> of 3.4Ghz per core). 
>> > However, mpiexec with more than one core is showing no speedup. >> > Also at the configuring/testing stage for petsc on that machine, there >> was no speedup and it only reported one node. >> > Is there somehting wrong with how i configured petsc or is the approach >> inappropriate for the machine? >> > I am not sure what files (or sections of the code) you would need to be >> able to answer my question. >> > >> > Thank you! >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 15 14:08:01 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 15 Sep 2014 14:08:01 -0500 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: References: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> Message-ID: <01E70DF0-6B32-4AC8-89A9-B7C0A406F901@mcs.anl.gov> Based on the streams speedups below it looks like a single core can utilize roughly 1/2 of the memory bandwidth, leaving all the other cores only 1/2 of the bandwidth to utilize, so you can only expect at best a speedup of roughly 2 on this machine with traditional PETSc sparse solvers. To add insult to injury it appears that the threads are not being assigned to physical cores very well either. Under the best circumstance on this system one would like to see a speedup of about 2 when running with two processes but it actually delivers only 1.23 and the speedup of 2 only occurs with 5 processes. I attribute this to the MPI or OS not assigning the second MPI process to the ?best? core for memory bandwidth. Likely it should assign the second MPI process to the 2nd CPU but instead it is assigning it also to the first CPU and only when it gets to the 5th MPI process does the second CPU get utilized. You can look at the documentation for your MPI?s process affinity to see if you can force the 2nd MPI process onto the second CPU. Barry np speedup 1 1.0 2 1.23 3 1.3 4 1.75 5 2.18 6 1.22 7 2.3 8 1.22 9 2.01 10 1.19 11 1.93 12 1.93 13 1.73 14 2.17 15 1.99 16 2.08 17 2.16 18 1.47 19 1.95 20 2.09 21 1.9 22 1.96 23 1.92 24 2.02 25 1.96 26 1.89 27 1.93 28 1.97 29 1.96 30 1.93 31 2.16 32 2.12 Estimation of possible On Sep 15, 2014, at 1:42 PM, Katy Ghantous wrote: > Matt, thanks! i will look into that and find other ways to make the computation faster. > > Barry, the benchmark reports up to 2 speedup, but says 1 node in the end. but either way i was expecting a higher speedup.. 2 is the limit for two cpus despite the multiple cores? > > please let me know if the file attached is what you are asking for. > Thank you! > > > On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith wrote: > > Please send the output from running > > make steams NPMAX=32 > > in the PETSc root directory. > > > Barry > > My guess is that it reports ?one node? is just because it uses the ?hostname? to distinguish nodes and though your machine has two CPUs, from the point of view of the OS it has only a single hostname and hence reports just one ?node?. > > > On Sep 15, 2014, at 12:45 PM, Katy Ghantous wrote: > > > Hi, > > I am using DMDA to run in parallel TS to solves a set of N equations. I am using DMDAGetCorners in the RHSfunction with setting the stencil size at 2 to solve a set of coupled ODEs on 30 cores. > > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed of 3.4Ghz per core). 
> > However, mpiexec with more than one core is showing no speedup. > > Also at the configuring/testing stage for petsc on that machine, there was no speedup and it only reported one node. > > Is there somehting wrong with how i configured petsc or is the approach inappropriate for the machine? > > I am not sure what files (or sections of the code) you would need to be able to answer my question. > > > > Thank you! > > > From James.Balasalle at digitalglobe.com Mon Sep 15 16:40:43 2014 From: James.Balasalle at digitalglobe.com (James Balasalle) Date: Mon, 15 Sep 2014 21:40:43 +0000 Subject: [petsc-users] Valgrind Errors In-Reply-To: References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> <54135A29.1060601@txcorp.com> Message-ID: <06D6C4A02103674E8911418149538BA412195AB5@PW00INFMAI003.digitalglobe.com> Hi Barry, Thanks for the response. You're right, it (both ex70 and my own code) doesn't give those valgrind errors when I run it in parallel. Changing the type to MATAIJ also fixes the issue. Thanks for the help, I appreciate it. James > -----Original Message----- > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: Friday, September 12, 2014 4:29 PM > To: Dominic Meiser > Cc: Barry Smith; James Balasalle; Zhang, Hong; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Valgrind Errors > > I'll check it. > Hong > > On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser > wrote: > > On 09/12/2014 02:11 PM, Barry Smith wrote: > >> > >> James (and Hong), > >> > >> Do you ever see this problem in parallel runs? > >> > >> You are not doing anything wrong. > >> > >> Here is what is happening. > >> > >> MatGetBrowsOfAoCols_MPIAIJ() which is used by > >> MatMatMult_MPIAIJ_MPIAIJ() assumes that the VecScatters for the > >> matrix-vector products are > >> > >> gen_to = (VecScatter_MPI_General*)ctx->todata; > >> gen_from = (VecScatter_MPI_General*)ctx->from data; > >> > >> but when run on one process the scatters are not of that form; hence > >> the code accesses values in what it thinks is one struct but is > >> actually a different one. Hence the valgrind errors. > >> > >> But since the matrix only lives on one process there is actually > >> nothing to move between processors hence no error happens in the > >> computation. You can avoid the issue completely by using MATAIJ > >> matrix for the type instead of MATMPIAIJ and then on one process it > automatically uses MATSEQAIJ. > >> > >> I don?t think the bug has anything in particular to do with the > >> MatTranspose. > >> > >> Hong, > >> > >> Can you please fix this code? Essentially you can by pass parts > >> of the code when the Mat is on only one process. (Maybe this also > >> happens for MPIBAIJ matrices?) Send a response letting me know you > saw this. > >> > >> Thanks > >> > >> Barry > > > > I had to fix a few issues similar to this a while back. The method > > VecScatterGetTypes_Private introduced in pull request 176 might be > > useful in this context. > > > > Cheers, > > Dominic > > This electronic communication and any attachments may contain confidential and proprietary information of DigitalGlobe, Inc. If you are not the intended recipient, or an agent or employee responsible for delivering this communication to the intended recipient, or if you have received this communication in error, please do not print, copy, retransmit, disseminate or otherwise use the information. 
Please indicate to the sender that you have received this communication in error, and delete the copy you received. DigitalGlobe reserves the right to monitor any electronic communication sent or received by its employees, agents or representatives. From hzhang at mcs.anl.gov Mon Sep 15 17:05:00 2014 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 15 Sep 2014 17:05:00 -0500 Subject: [petsc-users] Valgrind Errors In-Reply-To: <06D6C4A02103674E8911418149538BA412195AB5@PW00INFMAI003.digitalglobe.com> References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> <54135A29.1060601@txcorp.com> <06D6C4A02103674E8911418149538BA412195AB5@PW00INFMAI003.digitalglobe.com> Message-ID: James : I'm fixing it in branch hzhang/matmatmult-bugfix https://bitbucket.org/petsc/petsc/commits/a7c7454dd425191f4a23aa5860b8c6bac03cfd7b Once it is further cleaned, and other routines are checked, I will patch petsc-release. Hong > Hi Barry, > > Thanks for the response. You're right, it (both ex70 and my own code) doesn't give those valgrind errors when I run it in parallel. Changing the type to MATAIJ also fixes the issue. > > Thanks for the help, I appreciate it. > > James > > > > > >> -----Original Message----- >> From: Hong [mailto:hzhang at mcs.anl.gov] >> Sent: Friday, September 12, 2014 4:29 PM >> To: Dominic Meiser >> Cc: Barry Smith; James Balasalle; Zhang, Hong; petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] Valgrind Errors >> >> I'll check it. >> Hong >> >> On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser >> wrote: >> > On 09/12/2014 02:11 PM, Barry Smith wrote: >> >> >> >> James (and Hong), >> >> >> >> Do you ever see this problem in parallel runs? >> >> >> >> You are not doing anything wrong. >> >> >> >> Here is what is happening. >> >> >> >> MatGetBrowsOfAoCols_MPIAIJ() which is used by >> >> MatMatMult_MPIAIJ_MPIAIJ() assumes that the VecScatters for the >> >> matrix-vector products are >> >> >> >> gen_to = (VecScatter_MPI_General*)ctx->todata; >> >> gen_from = (VecScatter_MPI_General*)ctx->from data; >> >> >> >> but when run on one process the scatters are not of that form; hence >> >> the code accesses values in what it thinks is one struct but is >> >> actually a different one. Hence the valgrind errors. >> >> >> >> But since the matrix only lives on one process there is actually >> >> nothing to move between processors hence no error happens in the >> >> computation. You can avoid the issue completely by using MATAIJ >> >> matrix for the type instead of MATMPIAIJ and then on one process it >> automatically uses MATSEQAIJ. >> >> >> >> I don?t think the bug has anything in particular to do with the >> >> MatTranspose. >> >> >> >> Hong, >> >> >> >> Can you please fix this code? Essentially you can by pass parts >> >> of the code when the Mat is on only one process. (Maybe this also >> >> happens for MPIBAIJ matrices?) Send a response letting me know you >> saw this. >> >> >> >> Thanks >> >> >> >> Barry >> > >> > I had to fix a few issues similar to this a while back. The method >> > VecScatterGetTypes_Private introduced in pull request 176 might be >> > useful in this context. >> > >> > Cheers, >> > Dominic >> > > > > > > > > > > This electronic communication and any attachments may contain confidential and proprietary > information of DigitalGlobe, Inc. 
If you are not the intended recipient, or an agent or employee > responsible for delivering this communication to the intended recipient, or if you have received > this communication in error, please do not print, copy, retransmit, disseminate or > otherwise use the information. Please indicate to the sender that you have received this > communication in error, and delete the copy you received. DigitalGlobe reserves the > right to monitor any electronic communication sent or received by its employees, agents > or representatives. > From jed at jedbrown.org Mon Sep 15 18:05:20 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 15 Sep 2014 17:05:20 -0600 Subject: [petsc-users] Identifier for DMs within DMComposite In-Reply-To: References: Message-ID: <8761gov61b.fsf@jedbrown.org> Gautam Bisht writes: > Hi, > > In my application, the number and order of DMs within a DMComposite could > vary. Is there a way to tag (perhaps using an integer value) individual DMs > within a DMComposite? One option is to name the DMs via PetscObjectSetName and programmatically get them out using DMCompositeGetEntriesArray and friends. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mailinglists at xgm.de Tue Sep 16 06:37:59 2014 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 16 Sep 2014 13:37:59 +0200 Subject: [petsc-users] Force running petsc sequentially Message-ID: <5577707.QnDV7Lnqcy@asaru> Hello, I'm currently replacing an RBF implementation with petsc linear algebra. The program itself runs parallel using MPI but the piece of code I work on runs strictly sequentially without making any use of MPI, just the same code on every node. Right now we're more interessted in patsc sparse matrix abilities then in its parallelization. Though parallelization is certainly interesting later.... What is the best way to run petsc sequentially? 1) MatSetType the matrix to MATSEQSBAIJ e.g. -> expects MPI communicator of size 1. 2) MatSetSizes(matrix, n, n, n, n) does not work. 2) MatCreate not with PETSC_COMM_WORLD but with the communicator of size 1. Where do I get it from? (probably MPI_Comm_create and friends) Is there another more petsc like way? Thanks, Florian From bsmith at mcs.anl.gov Tue Sep 16 07:42:32 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 Sep 2014 07:42:32 -0500 Subject: [petsc-users] Force running petsc sequentially In-Reply-To: <5577707.QnDV7Lnqcy@asaru> References: <5577707.QnDV7Lnqcy@asaru> Message-ID: On Sep 16, 2014, at 6:37 AM, Florian Lindner wrote: > Hello, > > I'm currently replacing an RBF implementation with petsc linear algebra. The program itself runs parallel using MPI but the piece of code I work on runs strictly sequentially without making any use of MPI, just the same code on every node. Right now we're more interessted in patsc sparse matrix abilities then in its parallelization. Though parallelization is certainly interesting later.... > > What is the best way to run petsc sequentially? > > 1) MatSetType the matrix to MATSEQSBAIJ e.g. -> expects MPI communicator of size 1. > 2) MatSetSizes(matrix, n, n, n, n) does not work. This should certainly work on one process > 2) MatCreate not with PETSC_COMM_WORLD but with the communicator of size 1. Where do I get it from? (probably MPI_Comm_create and friends) Just use PETSC_COMM_SELF > > Is there another more petsc like way? 
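To spell out Barry's suggestion in code: the sketch below shows one strictly sequential setup, with a hypothetical size n, error checking abbreviated, and MATSEQAIJ chosen purely for illustration (MATSEQSBAIJ works the same way if only the upper triangle is stored). It reflects a reading of this thread, not an official example.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       n = 50;                  /* hypothetical matrix size */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr);
  /* Each rank builds its own serial matrix on PETSC_COMM_SELF, so the rest
     of the application can keep running under MPI while this piece of code
     stays strictly sequential. */
  ierr = MatCreate(PETSC_COMM_SELF, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, n, n, n, n);CHKERRQ(ierr);
  ierr = MatSetType(A, MATSEQAIJ);CHKERRQ(ierr);   /* or MATSEQSBAIJ, MATSEQDENSE, ... */
  ierr = MatSetUp(A);CHKERRQ(ierr);
  /* ... MatSetValues(), MatAssemblyBegin/End(), and a KSP created on
     PETSC_COMM_SELF for the RBF system ... */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}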
> > Thanks, > Florian From hillsmattc at outlook.com Tue Sep 16 07:41:48 2014 From: hillsmattc at outlook.com (Matthew Hills) Date: Tue, 16 Sep 2014 14:41:48 +0200 Subject: [petsc-users] PETSc/TAU configuration Message-ID: Hi PETSc Team, I am experiencing difficulties with configuring PETSc with TAU. I have replaced the standard compilers with the tau_cc.sh, tau_cxx.sh, and tau_f90.sh scripts but this produces multiple errors. I have also attempted to use OpenMPI and MPICH, but both produce their own unique errors. After successfully compiling PDT, TAU was compiled with: ./configure -prefix=`pwd` -cc=gcc -c++=g++ -fortran=gfortran -pdt=${SESKADIR}/packages/pdt -mpiinc=${PETSC_DIR}/${PETSC_ARCH}/include -mpilib=${PETSC_DIR}/${PETSC_ARCH}/lib -bfd=download Attached you'll find the PETSc configuration logs. If any more information is needed please let me know. Warm regards, Matthew HillsUniversity of Cape Town -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure_mpich.log URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure_openmpi.log URL: From balay at mcs.anl.gov Tue Sep 16 08:21:41 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 16 Sep 2014 08:21:41 -0500 Subject: [petsc-users] PETSc/TAU configuration In-Reply-To: References: Message-ID: I haven't tried using TAU in a while - but here are some obvious things to try. 1. --download-mpich [or openmpi] with TAU does not make sense. You would have to build MPICH/OpenMPI first. Then build TAU to use this MPI. And then build PETSc to use this TAU_CC/MPI 2. I would use only tau_cc.sh - and not bother with c++/fortran i.e [with TAU build with a given mpicc] - configure PETSc with: ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 3. Do not use any --download-package when using tau_cc.sh. First check if you are able to use TAU with PETSc - without externalpackages [you would need blas,mpi. Use system blas/lapack for blas/lapack - and build MPI as mentioned above for use with TAU and later PETSc] And if you really need these externalpackage [assuming the above basic build with TAU works] - I would recommend the following 2 step build process: 4.1. ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-PACKAGE PETSC_ARCH=arch-packages 4.2. Now strip out the petsc relavent stuff from this location rm -f arch-packages/include/petsc*.h 4.3. Now build PETSc with TAU - using these prebuilt-packages ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 PETSC_ARCH=arch-tau --with-PACKAGE-dir=`pwd`/arch-packages BTW: the current release is petsc-3.5 - we recommend upgrading to using it [as we usually support the latest release wrt debugging/bug fixes] Satish 1. dont use any download packages. If you need them - you have to build them separately. One way to do this - is to let petsc do the build like: ./configure --download-package PETSC_ARCH=arch-packages [then remove On Tue, 16 Sep 2014, Matthew Hills wrote: > > > > Hi PETSc Team, > I am experiencing difficulties with configuring PETSc with TAU. I have replaced the standard compilers with the tau_cc.sh, tau_cxx.sh, and tau_f90.sh scripts but this produces multiple errors. I have also attempted to use OpenMPI and MPICH, but both produce their own unique errors. 
> After successfully compiling PDT, TAU was compiled with: > > > > > > > > > > ./configure > -prefix=`pwd` > -cc=gcc > -c++=g++ -fortran=gfortran > -pdt=${SESKADIR}/packages/pdt > -mpiinc=${PETSC_DIR}/${PETSC_ARCH}/include > -mpilib=${PETSC_DIR}/${PETSC_ARCH}/lib -bfd=download > > > > Attached you'll find the PETSc configuration logs. If any more information is needed please let me know. > > Warm regards, > > Matthew HillsUniversity of Cape Town > > From katyghantous at gmail.com Tue Sep 16 10:08:20 2014 From: katyghantous at gmail.com (Katy Ghantous) Date: Tue, 16 Sep 2014 17:08:20 +0200 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: <01E70DF0-6B32-4AC8-89A9-B7C0A406F901@mcs.anl.gov> References: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> <01E70DF0-6B32-4AC8-89A9-B7C0A406F901@mcs.anl.gov> Message-ID: thank you! this has been extremely useful in figuring out a plan of action. On Mon, Sep 15, 2014 at 9:08 PM, Barry Smith wrote: > > Based on the streams speedups below it looks like a single core can > utilize roughly 1/2 of the memory bandwidth, leaving all the other cores > only 1/2 of the bandwidth to utilize, so you can only expect at best a > speedup of roughly 2 on this machine with traditional PETSc sparse solvers. > > To add insult to injury it appears that the threads are not being > assigned to physical cores very well either. Under the best circumstance > on this system one would like to see a speedup of about 2 when running with > two processes but it actually delivers only 1.23 and the speedup of 2 only > occurs with 5 processes. I attribute this to the MPI or OS not assigning > the second MPI process to the ?best? core for memory bandwidth. Likely it > should assign the second MPI process to the 2nd CPU but instead it is > assigning it also to the first CPU and only when it gets to the 5th MPI > process does the second CPU get utilized. > > You can look at the documentation for your MPI?s process affinity to > see if you can force the 2nd MPI process onto the second CPU. > > Barry > > > np speedup > 1 1.0 > 2 1.23 > 3 1.3 > 4 1.75 > 5 2.18 > > > 6 1.22 > 7 2.3 > 8 1.22 > 9 2.01 > 10 1.19 > 11 1.93 > 12 1.93 > 13 1.73 > 14 2.17 > 15 1.99 > 16 2.08 > 17 2.16 > 18 1.47 > 19 1.95 > 20 2.09 > 21 1.9 > 22 1.96 > 23 1.92 > 24 2.02 > 25 1.96 > 26 1.89 > 27 1.93 > 28 1.97 > 29 1.96 > 30 1.93 > 31 2.16 > 32 2.12 > Estimation of possible > > On Sep 15, 2014, at 1:42 PM, Katy Ghantous wrote: > > > Matt, thanks! i will look into that and find other ways to make the > computation faster. > > > > Barry, the benchmark reports up to 2 speedup, but says 1 node in the > end. but either way i was expecting a higher speedup.. 2 is the limit for > two cpus despite the multiple cores? > > > > please let me know if the file attached is what you are asking for. > > Thank you! > > > > > > On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith wrote: > > > > Please send the output from running > > > > make steams NPMAX=32 > > > > in the PETSc root directory. > > > > > > Barry > > > > My guess is that it reports ?one node? is just because it uses the > ?hostname? to distinguish nodes and though your machine has two CPUs, from > the point of view of the OS it has only a single hostname and hence reports > just one ?node?. > > > > > > On Sep 15, 2014, at 12:45 PM, Katy Ghantous > wrote: > > > > > Hi, > > > I am using DMDA to run in parallel TS to solves a set of N equations. 
> I am using DMDAGetCorners in the RHSfunction with setting the stencil size > at 2 to solve a set of coupled ODEs on 30 cores. > > > The machine has 32 cores (2 physical CPUs with 2x8 core each with > speed of 3.4Ghz per core). > > > However, mpiexec with more than one core is showing no speedup. > > > Also at the configuring/testing stage for petsc on that machine, there > was no speedup and it only reported one node. > > > Is there somehting wrong with how i configured petsc or is the > approach inappropriate for the machine? > > > I am not sure what files (or sections of the code) you would need to > be able to answer my question. > > > > > > Thank you! > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Tue Sep 16 10:48:59 2014 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 16 Sep 2014 17:48:59 +0200 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains Message-ID: <54185BEB.9060803@gmail.com> For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). In one dimension with 7 unknowns and 2 domains, this might look like 0 1 2 3 4 5 6 (unknowns) ------------ (first subdomain : 0 .. 3) ----------- (second subdomain : 3 .. 6) The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were 0 1 2 3 4 5 | 6 7 8 9 10 11 | | 12 13 14 15 16 17 | -------- ----------- then one minimally-overlapping set of 4 subdomains would be 0 1 2 3 6 7 8 9 3 4 5 9 10 11 6 7 8 9 12 13 14 15 9 10 11 15 16 17 as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. * For example, in the 1D example above this would correspond to is[0] <-- 0 1 2 3 is[1] <-- 3 4 5 6 is_local[0] <-- 0 1 2 is_local[1] <-- 3 4 5 6 From Carol.Brickley at awe.co.uk Tue Sep 16 11:13:48 2014 From: Carol.Brickley at awe.co.uk (Carol.Brickley at awe.co.uk) Date: Tue, 16 Sep 2014 16:13:48 +0000 Subject: [petsc-users] Error compiling f90 code with petsc 3.5.2 Message-ID: <201409161613.s8GGDosG010334@msw1.awe.co.uk> Hi, When I try to compile an F90 code and include the 3.5.2 petsc built under intelmpi-4.1.3 and intel13.1, I get: ...error #5082: Syntax error, found '::' when expecting one of: ( % [ : . = => DMDABoundaryType :: bx, by, bz ! boundary type ...............................................^ This compiles fine with petsc 3.4.3. Any ideas? 
Carol Dr Carol Brickley ___________________________________________________ ____________________________ The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited. Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer. While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected. AWE Plc Registered in England and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 16 11:27:22 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Sep 2014 11:27:22 -0500 Subject: [petsc-users] Error compiling f90 code with petsc 3.5.2 In-Reply-To: <201409161613.s8GGDosG010334@msw1.awe.co.uk> References: <201409161613.s8GGDosG010334@msw1.awe.co.uk> Message-ID: On Tue, Sep 16, 2014 at 11:13 AM, wrote: > Hi, > > > > When I try to compile an F90 code and include the 3.5.2 petsc built under > intelmpi-4.1.3 and intel13.1, I get: > > > > *?error #5082: Syntax error, found ?::? when expecting one of: ( % [ : . = > =>* > > * DMDABoundaryType :: bx, by, bz ! boundary type* > > *???????????????..^* > > > > This compiles fine with petsc 3.4.3. > > > > Any ideas? > This has been changed to DMBoundaryType since now all DMs should respect boundaries. See http://www.mcs.anl.gov/petsc/documentation/changes/35.html Thanks, Matt > > > Carol > > > > *Dr Carol Brickley * > > > > > > > > ___________________________________________________ > ____________________________ The information in this email and in any > attachment(s) is commercial in confidence. If you are not the named > addressee(s) or if you receive this email in error then any distribution, > copying or use of this communication or the information in it is strictly > prohibited. Please notify us immediately by email at admin.internet(at) > awe.co.uk, and then delete this message from your computer. While > attachments are virus checked, AWE plc does not accept any liability in > respect of any virus which is not detected. AWE Plc Registered in England > and Wales Registration No 02763902 AWE, Aldermaston, Reading, RG7 4PR > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sghosh2012 at gatech.edu Tue Sep 16 13:10:56 2014 From: sghosh2012 at gatech.edu (Ghosh, Swarnava) Date: Tue, 16 Sep 2014 14:10:56 -0400 (EDT) Subject: [petsc-users] General query, default preconditioner In-Reply-To: <839192039.1228659.1410888328262.JavaMail.root@mail.gatech.edu> Message-ID: <1285214876.1279813.1410891056689.JavaMail.root@mail.gatech.edu> Hello, I just had a general query if PETSC has something like a "default preconditioner" when preconditioner type is not set and equations are solved using KSPSolve and GMRES method. 
Specifically, I am using the following lines of code: KSPCreate(PETSC_COMM_WORLD,&pOfdft->ksp); KSPSetType(pOfdft->ksp, KSPGMRES); KSPSetOperators(pOfdft->ksp,A,A,SAME_NONZERO_PATTERN); KSPSetTolerances(pOfdft->ksp,KSPTOL,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT); KSPSetFromOptions(pOfdft->ksp); KSPSetUp(pOfdft->ksp); Regards, Swarnava -- From abhyshr at mcs.anl.gov Tue Sep 16 13:47:12 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Tue, 16 Sep 2014 18:47:12 +0000 Subject: [petsc-users] General query, default preconditioner In-Reply-To: <1285214876.1279813.1410891056689.JavaMail.root@mail.gatech.edu> Message-ID: PETSc defaults to using an ILU(0) preconditioner in serial and Block-Jacobi + ILU(0) on blocks in parallel. You can see the preconditioner details using the option -ksp_view. Shri -----Original Message----- From: "Ghosh, Swarnava" > Date: Tue, 16 Sep 2014 14:10:56 -0400 To: > Subject: [petsc-users] General query, default preconditioner Hello, I just had a general query if PETSC has something like a "default preconditioner" when preconditioner type is not set and equations are solved using KSPSolve and GMRES method. Specifically, I am using the following lines of code: KSPCreate(PETSC_COMM_WORLD,&pOfdft->ksp); KSPSetType(pOfdft->ksp, KSPGMRES); KSPSetOperators(pOfdft->ksp,A,A,SAME_NONZERO_PATTERN); KSPSetTolerances(pOfdft->ksp,KSPTOL,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT); KSPSetFromOptions(pOfdft->ksp); KSPSetUp(pOfdft->ksp); Regards, Swarnava -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 16 14:04:02 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 Sep 2014 14:04:02 -0500 Subject: [petsc-users] speedup for TS solver using DMDA In-Reply-To: References: <379CA09C-26C5-4298-B3A7-471BF2CF5527@mcs.anl.gov> <01E70DF0-6B32-4AC8-89A9-B7C0A406F901@mcs.anl.gov> Message-ID: <69AA1321-AF0E-460B-98BD-D457AAC4C2EF@mcs.anl.gov> The tool hwloc can be useful in understanding the organization of cores and memories on a machine. For example I run lstopo --no-icaches --no-io --ignore PU (along with make streams in the root PETSc directory) on my laptop and it shows np speedup 1 1.0 2 1.43 3 1.47 4 1.45 Estimation of possible speedup of MPI programs based on Streams benchmark. It appears you have 1 node(s) See graph in the file src/benchmarks/streams/scaling.png Machine (16GB) + NUMANode L#0 (P#0 16GB) + L3 L#0 (6144KB) L2 L#0 (256KB) + L1d L#0 (32KB) + Core L#0 L2 L#1 (256KB) + L1d L#1 (32KB) + Core L#1 L2 L#2 (256KB) + L1d L#2 (32KB) + Core L#2 L2 L#3 (256KB) + L1d L#3 (32KB) + Core L#3 This system has one ?memory bank?, 1 CPU and 4 cores. Note that when two cores are running the streams benchmark they are essentially utilizing all of the memory bandwidth hence you get no further speed up after two cores. Next I run on a ?server? class workstation with two ?memory banks?, each associated with a CPU with 8 cores np speedup 1 1.0 2 1.8 3 2.21 4 2.35 5 2.4 6 2.41 7 3.3 8 2.4 9 2.66 10 2.22 11 2.28 12 4.04 13 2.46 14 2.61 15 4.11 16 3.01 Estimation of possible speedup of MPI programs based on Streams benchmark. 
It appears you have 1 node(s) See graph in the file src/benchmarks/streams/scaling.png Machine (128GB) NUMANode L#0 (P#0 64GB) + Socket L#0 + L3 L#0 (20MB) L2 L#0 (256KB) + L1d L#0 (32KB) + Core L#0 L2 L#1 (256KB) + L1d L#1 (32KB) + Core L#1 L2 L#2 (256KB) + L1d L#2 (32KB) + Core L#2 L2 L#3 (256KB) + L1d L#3 (32KB) + Core L#3 L2 L#4 (256KB) + L1d L#4 (32KB) + Core L#4 L2 L#5 (256KB) + L1d L#5 (32KB) + Core L#5 L2 L#6 (256KB) + L1d L#6 (32KB) + Core L#6 L2 L#7 (256KB) + L1d L#7 (32KB) + Core L#7 NUMANode L#1 (P#1 64GB) + Socket L#1 + L3 L#1 (20MB) L2 L#8 (256KB) + L1d L#8 (32KB) + Core L#8 L2 L#9 (256KB) + L1d L#9 (32KB) + Core L#9 L2 L#10 (256KB) + L1d L#10 (32KB) + Core L#10 L2 L#11 (256KB) + L1d L#11 (32KB) + Core L#11 L2 L#12 (256KB) + L1d L#12 (32KB) + Core L#12 L2 L#13 (256KB) + L1d L#13 (32KB) + Core L#13 L2 L#14 (256KB) + L1d L#14 (32KB) + Core L#14 L2 L#15 (256KB) + L1d L#15 (32KB) + Core L#15 Note the speedup gets to be as high as 4 meaning that the memory is fast enough to fully server at least four cores. But the speed up jumps all over the place when using from 1 to 16 cores. I am guessing that is because the MPI processes are not being well mapped to cores. So I run with the additional MPICH mpiexec options -bind-to socket -map-by hwthread and get np speedup 1 1.0 2 2.26 3 2.79 4 2.93 5 2.99 6 3.0 7 3.01 8 2.99 9 2.81 10 2.81 11 2.9 12 2.94 13 2.94 14 2.94 15 2.93 16 2.93 Estimation of possible speedup of MPI programs based on Streams benchmark. The I run with just the -bind-to socket and get much better numbers np speedup 1 1.0 2 2.41 3 3.36 4 4.45 5 4.51 6 5.45 7 5.07 8 5.81 9 5.27 10 5.93 11 5.42 12 5.95 13 5.49 14 5.94 15 5.56 16 5.88 Using this option I get roughly a speedup of 6. See http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding for more information on these options Barry On Sep 16, 2014, at 10:08 AM, Katy Ghantous wrote: > thank you! this has been extremely useful in figuring out a plan of action. > > > On Mon, Sep 15, 2014 at 9:08 PM, Barry Smith wrote: > > Based on the streams speedups below it looks like a single core can utilize roughly 1/2 of the memory bandwidth, leaving all the other cores only 1/2 of the bandwidth to utilize, so you can only expect at best a speedup of roughly 2 on this machine with traditional PETSc sparse solvers. > > To add insult to injury it appears that the threads are not being assigned to physical cores very well either. Under the best circumstance on this system one would like to see a speedup of about 2 when running with two processes but it actually delivers only 1.23 and the speedup of 2 only occurs with 5 processes. I attribute this to the MPI or OS not assigning the second MPI process to the ?best? core for memory bandwidth. Likely it should assign the second MPI process to the 2nd CPU but instead it is assigning it also to the first CPU and only when it gets to the 5th MPI process does the second CPU get utilized. > > You can look at the documentation for your MPI?s process affinity to see if you can force the 2nd MPI process onto the second CPU. > > Barry > > > np speedup > 1 1.0 > 2 1.23 > 3 1.3 > 4 1.75 > 5 2.18 > > > 6 1.22 > 7 2.3 > 8 1.22 > 9 2.01 > 10 1.19 > 11 1.93 > 12 1.93 > 13 1.73 > 14 2.17 > 15 1.99 > 16 2.08 > 17 2.16 > 18 1.47 > 19 1.95 > 20 2.09 > 21 1.9 > 22 1.96 > 23 1.92 > 24 2.02 > 25 1.96 > 26 1.89 > 27 1.93 > 28 1.97 > 29 1.96 > 30 1.93 > 31 2.16 > 32 2.12 > Estimation of possible > > On Sep 15, 2014, at 1:42 PM, Katy Ghantous wrote: > > > Matt, thanks! 
i will look into that and find other ways to make the computation faster. > > > > Barry, the benchmark reports up to 2 speedup, but says 1 node in the end. but either way i was expecting a higher speedup.. 2 is the limit for two cpus despite the multiple cores? > > > > please let me know if the file attached is what you are asking for. > > Thank you! > > > > > > On Mon, Sep 15, 2014 at 8:23 PM, Barry Smith wrote: > > > > Please send the output from running > > > > make steams NPMAX=32 > > > > in the PETSc root directory. > > > > > > Barry > > > > My guess is that it reports ?one node? is just because it uses the ?hostname? to distinguish nodes and though your machine has two CPUs, from the point of view of the OS it has only a single hostname and hence reports just one ?node?. > > > > > > On Sep 15, 2014, at 12:45 PM, Katy Ghantous wrote: > > > > > Hi, > > > I am using DMDA to run in parallel TS to solves a set of N equations. I am using DMDAGetCorners in the RHSfunction with setting the stencil size at 2 to solve a set of coupled ODEs on 30 cores. > > > The machine has 32 cores (2 physical CPUs with 2x8 core each with speed of 3.4Ghz per core). > > > However, mpiexec with more than one core is showing no speedup. > > > Also at the configuring/testing stage for petsc on that machine, there was no speedup and it only reported one node. > > > Is there somehting wrong with how i configured petsc or is the approach inappropriate for the machine? > > > I am not sure what files (or sections of the code) you would need to be able to answer my question. > > > > > > Thank you! > > > > > > > > From bsmith at mcs.anl.gov Tue Sep 16 14:23:42 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 Sep 2014 14:23:42 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <54185BEB.9060803@gmail.com> References: <54185BEB.9060803@gmail.com> Message-ID: <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> Patrick, This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use > is[0] <-- 0 1 2 3 > is[1] <-- 3 4 5 6 > is_local[0] <-- 0 1 2 3 > is_local[1] <-- 3 4 5 6 Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); Barry Note that is_local[] doesn?t have to be non-overlapping or anything. On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: > For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). > > In one dimension with 7 unknowns and 2 domains, this might look like > > 0 1 2 3 4 5 6 (unknowns) > ------------ (first subdomain : 0 .. 3) > ----------- (second subdomain : 3 .. 6) > > The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. > > In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were > > 0 1 2 3 4 5 | > 6 7 8 9 10 11 | | > 12 13 14 15 16 17 | > -------- > ----------- > > then one minimally-overlapping set of 4 subdomains would be > 0 1 2 3 6 7 8 9 > 3 4 5 9 10 11 > 6 7 8 9 12 13 14 15 > 9 10 11 15 16 17 > as suggested by the dashes and pipes above. 
The subdomains only overlap by a single row or column of the grid. > > My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. > > I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. > > * For example, in the 1D example above this would correspond to > is[0] <-- 0 1 2 3 > is[1] <-- 3 4 5 6 > is_local[0] <-- 0 1 2 > is_local[1] <-- 3 4 5 6 > > > > From knepley at gmail.com Tue Sep 16 14:29:46 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Sep 2014 14:29:46 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> Message-ID: On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: > > Patrick, > > This "local part of the subdomains for this processor? term in > PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I > think that if you set the is_local[] to be different than the is[] you will > always end up with a nonsymetric preconditioner. I think for one dimension > you need to use No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping sets. These sets are is_local[]. Then any information you get from another domain is another row, which is put into is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information transmission which is strange for convergence. Matt > > > is[0] <-- 0 1 2 3 > > is[1] <-- 3 4 5 6 > > is_local[0] <-- 0 1 2 3 > > is_local[1] <-- 3 4 5 6 > > Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > > Barry > > > Note that is_local[] doesn?t have to be non-overlapping or anything. > > > On Sep 16, 2014, at 10:48 AM, Patrick Sanan > wrote: > > > For the purposes of reproducing an example from a paper, I'd like to use > PCASM with subdomains which 'overlap minimally' (though this is probably > never a good idea in practice). > > > > In one dimension with 7 unknowns and 2 domains, this might look like > > > > 0 1 2 3 4 5 6 (unknowns) > > ------------ (first subdomain : 0 .. 3) > > ----------- (second subdomain : 3 .. 6) > > > > The subdomains share only a single grid point, which differs from the > way PCASM is used in most of the examples. > > > > In two dimensions, minimally overlapping rectangular subdomains would > overlap one exactly one row or column of the grid. 
Thus, for example, if > the grid unknowns were > > > > 0 1 2 3 4 5 | > > 6 7 8 9 10 11 | | > > 12 13 14 15 16 17 | > > -------- > > ----------- > > > > then one minimally-overlapping set of 4 subdomains would be > > 0 1 2 3 6 7 8 9 > > 3 4 5 9 10 11 > > 6 7 8 9 12 13 14 15 > > 9 10 11 15 16 17 > > as suggested by the dashes and pipes above. The subdomains only overlap > by a single row or column of the grid. > > > > My question is whether and how one can use the PCASM interface to work > with these sorts of decompositions (It's fine for my purposes to use a > single MPI process). In particular, I don't quite understand if should be > possible to define these decompositions by correctly providing is and > is_local arguments to PCASMSetLocalSubdomains. > > > > I have gotten code to run defining the is_local entries to be subsets of > the is entries which define a partition of the global degrees of freedom*, > but I'm not certain that this was the correct choice, as it appears to > produce an unsymmetric preconditioner for a symmetric system when I use > direct subdomain solves and the 'basic' type for PCASM. > > > > * For example, in the 1D example above this would correspond to > > is[0] <-- 0 1 2 3 > > is[1] <-- 3 4 5 6 > > is_local[0] <-- 0 1 2 > > is_local[1] <-- 3 4 5 6 > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 16 14:43:16 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 16 Sep 2014 14:43:16 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> Message-ID: <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: > On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: > > Patrick, > > This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use > > No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps > Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. Sure it is. > > Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping > sets. These sets are is_local[]. Then any information you get from another domain is another row, which is put into > is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information > transmission which is strange for convergence. No, not a all. | 0 1 2 3 4 5 6 | Domain 0 is the region from | to 4 with Dirichlet boundary conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary conditions at each end (2 and |) . If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all kinds of VecScatter creations from the various is and is_local, ?restriction?, ?prolongation? and ?localization? then in the apply the different scatters are applied in the two directions, which results in a non-symmetric operator. 
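For comparison, the more common way to drive ASM — letting PETSc generate the overlapping subdomains rather than supplying index sets by hand — is sketched below (petsc-3.5 calling sequence assumed, matrix A assembled elsewhere, error checking omitted):

    KSP ksp;
    PC  pc;
    Mat A;                                  /* assumed assembled elsewhere */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);             /* petsc-3.5 calling sequence */
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCASM);
    PCASMSetType(pc, PC_ASM_BASIC);         /* or PC_ASM_RESTRICT */
    PCASMSetOverlap(pc, 1);                 /* grow the per-process pieces by one layer */
    KSPSetFromOptions(ksp);

The same thing is available from the command line as -pc_type asm -pc_asm_type basic -pc_asm_overlap 1.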
Barry > > Matt > > > > is[0] <-- 0 1 2 3 > > is[1] <-- 3 4 5 6 > > is_local[0] <-- 0 1 2 3 > > is_local[1] <-- 3 4 5 6 > > Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > > Barry > > > Note that is_local[] doesn?t have to be non-overlapping or anything. > > > On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: > > > For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). > > > > In one dimension with 7 unknowns and 2 domains, this might look like > > > > 0 1 2 3 4 5 6 (unknowns) > > ------------ (first subdomain : 0 .. 3) > > ----------- (second subdomain : 3 .. 6) > > > > The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. > > > > In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were > > > > 0 1 2 3 4 5 | > > 6 7 8 9 10 11 | | > > 12 13 14 15 16 17 | > > -------- > > ----------- > > > > then one minimally-overlapping set of 4 subdomains would be > > 0 1 2 3 6 7 8 9 > > 3 4 5 9 10 11 > > 6 7 8 9 12 13 14 15 > > 9 10 11 15 16 17 > > as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. > > > > My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. > > > > I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. > > > > * For example, in the 1D example above this would correspond to > > is[0] <-- 0 1 2 3 > > is[1] <-- 3 4 5 6 > > is_local[0] <-- 0 1 2 > > is_local[1] <-- 3 4 5 6 > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From 4bikerboyjohn at gmail.com Tue Sep 16 18:14:06 2014 From: 4bikerboyjohn at gmail.com (John Alletto) Date: Tue, 16 Sep 2014 16:14:06 -0700 Subject: [petsc-users] 4th order stencil within PETSc Message-ID: I have worked with the 7-point star stencil and the 27-point Box stencil. Each goes out from its center +- 1 position in i,j,k. I have a 13-point star stencil (4th order) which goes out +- 2 positions from its center in i,j,k. Can I use this in the PETSc ? Do I have to do something different than what is done with smaller conventional stencil?s? Respectfully John -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 16 18:28:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Sep 2014 18:28:52 -0500 Subject: [petsc-users] 4th order stencil within PETSc In-Reply-To: References: Message-ID: On Tue, Sep 16, 2014 at 6:14 PM, John Alletto <4bikerboyjohn at gmail.com> wrote: > > I have worked with the 7-point star stencil and the 27-point Box stencil. > Each goes out from its center +- 1 position in i,j,k. 
> > I have a 13-point star stencil (4th order) which goes out +- 2 positions > from its center in i,j,k. > > Can I use this in the PETSc ? Do I have to do something different than > what is done with smaller conventional stencil?s? > When you create the DMDA, give a stencil width of 2 (the s parameter). Then you can index +- 2 outside of your partition. Thanks, Matt > > Respectfully > John > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From csp at info.szwgroup.com Wed Sep 17 02:46:40 2014 From: csp at info.szwgroup.com (=?utf-8?B?TXMuIEVsbGEgV2Vp?=) Date: Wed, 17 Sep 2014 15:46:40 +0800 (CST) Subject: [petsc-users] =?utf-8?q?IDC_invests_big_in_South_Africa_CSP_proje?= =?utf-8?q?cts?= Message-ID: <20140917074640.EA4EB43F6987@mx2.easyiye.com> An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Wed Sep 17 15:03:35 2014 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 17 Sep 2014 22:03:35 +0200 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> Message-ID: <5419E917.2060405@gmail.com> On 9/16/14 9:43 PM, Barry Smith wrote: > On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: > >> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: >> >> Patrick, >> >> This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use >> >> No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps >> Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. > Sure it is. >> Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping >> sets. These sets are is_local[]. Then any information you get from another domain is another row, which is put into >> is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information >> transmission which is strange for convergence. > No, not a all. > > > | 0 1 2 3 4 5 6 | > > Domain 0 is the region from | to 4 with Dirichlet boundary conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary conditions at each end (2 and |) . > > If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all kinds of VecScatter creations from the various is and is_local, ?restriction?, ?prolongation? and ?localization? then in the apply the different scatters are applied in the two directions, which results in a non-symmetric operator. I was able to get my uniprocessor example to give the (symmetric) preconditioner I expected by commenting out the check in PCSetUp_ASM (line 311 in asm.c) and using PCASMSetLocalSubdomains with the same (overlapping) IS's for both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It also works passing NULL for is_local. 
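A sketch of the call sequence that worked, for the 7-unknown, 2-domain example above (one MPI process owning both subdomains; indices hard-coded for illustration, error checking omitted):

    PC       pc;                        /* the PCASM obtained from KSPGetPC() */
    IS       is[2];
    PetscInt idx0[] = {0, 1, 2, 3};
    PetscInt idx1[] = {3, 4, 5, 6};     /* grid point 3 lies in both subdomains */
    ISCreateGeneral(PETSC_COMM_SELF, 4, idx0, PETSC_COPY_VALUES, &is[0]);
    ISCreateGeneral(PETSC_COMM_SELF, 4, idx1, PETSC_COPY_VALUES, &is[1]);
    PCASMSetLocalSubdomains(pc, 2, is, is);   /* same overlapping ISs for is and is_local */
    /* or, as noted above, equivalently: PCASMSetLocalSubdomains(pc, 2, is, NULL); */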
I assume that the purpose of the check mentioned above is to ensure that every grid point is assigned to exactly one processor, which is needed by whatever interprocess scattering goes on in the implementation. Also, I assume that augmenting the domain definition with an explicit specification of the way domains are distributed over processes allows for more controllable use of PC_ASM_RESTRICT, with all its attractive properties. Anyhow, Barry's advice previously in this thread works locally (for one test case) if you remove the check above, but the current implementation enforces something related to what Matt describes, which might be overly restrictive if multiple domains share a process. The impression I got initially from the documentation was that if one uses PC_ASM_BASIC, the choice of is_local should only influence the details of the communication pattern, not (in exact arithmetic, with process-count-independent subsolves) the preconditioner being defined. For regular grids this all seems pretty pathological (in practice I imagine people want to use symmetric overlaps, and I assume that one domain per node is the most common use case), but I could imagine it being more of a real concern when working with unstructured grids. > > Barry > > > >> Matt >> >> >>> is[0] <-- 0 1 2 3 >>> is[1] <-- 3 4 5 6 >>> is_local[0] <-- 0 1 2 3 >>> is_local[1] <-- 3 4 5 6 >> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); >> >> Barry >> >> >> Note that is_local[] doesn?t have to be non-overlapping or anything. >> >> >> On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: >> >>> For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). >>> >>> In one dimension with 7 unknowns and 2 domains, this might look like >>> >>> 0 1 2 3 4 5 6 (unknowns) >>> ------------ (first subdomain : 0 .. 3) >>> ----------- (second subdomain : 3 .. 6) >>> >>> The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. >>> >>> In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were >>> >>> 0 1 2 3 4 5 | >>> 6 7 8 9 10 11 | | >>> 12 13 14 15 16 17 | >>> -------- >>> ----------- >>> >>> then one minimally-overlapping set of 4 subdomains would be >>> 0 1 2 3 6 7 8 9 >>> 3 4 5 9 10 11 >>> 6 7 8 9 12 13 14 15 >>> 9 10 11 15 16 17 >>> as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. >>> >>> My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. >>> >>> I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. 
>>> >>> * For example, in the 1D example above this would correspond to >>> is[0] <-- 0 1 2 3 >>> is[1] <-- 3 4 5 6 >>> is_local[0] <-- 0 1 2 >>> is_local[1] <-- 3 4 5 6 >>> >>> >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener From bsmith at mcs.anl.gov Wed Sep 17 15:12:28 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Sep 2014 15:12:28 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <5419E917.2060405@gmail.com> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> Message-ID: <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> On Sep 17, 2014, at 3:03 PM, Patrick Sanan wrote: > On 9/16/14 9:43 PM, Barry Smith wrote: >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: >> >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: >>> >>> Patrick, >>> >>> This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use >>> >>> No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps >>> Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. >> Sure it is. >>> Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping >>> sets. These sets are is_local[]. Then any information you get from another domain is another row, which is put into >>> is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information >>> transmission which is strange for convergence. >> No, not a all. >> >> >> | 0 1 2 3 4 5 6 | >> >> Domain 0 is the region from | to 4 with Dirichlet boundary conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary conditions at each end (2 and |) . >> >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all kinds of VecScatter creations from the various is and is_local, ?restriction?, ?prolongation? and ?localization? then in the apply the different scatters are applied in the two directions, which results in a non-symmetric operator. > > I was able to get my uniprocessor example to give the (symmetric) preconditioner I expected by commenting out the check in PCSetUp_ASM (line 311 in asm.c) if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); This check is absurd and needs to be removed. > and using PCASMSetLocalSubdomains with the same (overlapping) IS's for both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It also works passing NULL for is_local. Great. > > I assume that the purpose of the check mentioned above is to ensure that every grid point is assigned to exactly one processor, which is needed by whatever interprocess scattering goes on in the implementation. 
Also, I assume that augmenting the domain definition with an explicit specification of the way domains are distributed over processes allows for more controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > Anyhow, Barry's advice previously in this thread works locally (for one test case) if you remove the check above, but the current implementation enforces something related to what Matt describes, which might be overly restrictive if multiple domains share a process. The impression I got initially from the documentation was that if one uses PC_ASM_BASIC, the choice of is_local should only influence the details of the communication pattern, not (in exact arithmetic, with process-count-independent subsolves) the preconditioner being defined. The ?communication pattern? does determine the preconditioner being defined. The introduction of is_local[] broke the clean usage of PC_ASM_* that use to exist, so your confusion is our fault, not yours. > > > For regular grids this all seems pretty pathological (in practice I imagine people want to use symmetric overlaps, As I said above you are using "symmetric overlaps,?. It just looks ?unsymmetric? if you introduce this concept of ?non-overlapping initial subdomains? which is an unneeded harmful concept. Barry > and I assume that one domain per node is the most common use case), but I could imagine it being more of a real concern when working with unstructured grids. > >> >> Barry >> >> >> >>> Matt >>> >>>> is[0] <-- 0 1 2 3 >>>> is[1] <-- 3 4 5 6 >>>> is_local[0] <-- 0 1 2 3 >>>> is_local[1] <-- 3 4 5 6 >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); >>> >>> Barry >>> >>> >>> Note that is_local[] doesn?t have to be non-overlapping or anything. >>> >>> >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: >>> >>>> For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). >>>> >>>> In one dimension with 7 unknowns and 2 domains, this might look like >>>> >>>> 0 1 2 3 4 5 6 (unknowns) >>>> ------------ (first subdomain : 0 .. 3) >>>> ----------- (second subdomain : 3 .. 6) >>>> >>>> The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. >>>> >>>> In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were >>>> >>>> 0 1 2 3 4 5 | >>>> 6 7 8 9 10 11 | | >>>> 12 13 14 15 16 17 | >>>> -------- >>>> ----------- >>>> >>>> then one minimally-overlapping set of 4 subdomains would be >>>> 0 1 2 3 6 7 8 9 >>>> 3 4 5 9 10 11 >>>> 6 7 8 9 12 13 14 15 >>>> 9 10 11 15 16 17 >>>> as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. >>>> >>>> My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. 
>>>> >>>> I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. >>>> >>>> * For example, in the 1D example above this would correspond to >>>> is[0] <-- 0 1 2 3 >>>> is[1] <-- 3 4 5 6 >>>> is_local[0] <-- 0 1 2 >>>> is_local[1] <-- 3 4 5 6 >>>> >>>> >>>> >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener From knepley at gmail.com Wed Sep 17 15:38:38 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Sep 2014 15:38:38 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> Message-ID: On Wed, Sep 17, 2014 at 3:12 PM, Barry Smith wrote: > > On Sep 17, 2014, at 3:03 PM, Patrick Sanan > wrote: > > > On 9/16/14 9:43 PM, Barry Smith wrote: > >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: > >> > >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith > wrote: > >>> > >>> Patrick, > >>> > >>> This "local part of the subdomains for this processor? term in > PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I > think that if you set the is_local[] to be different than the is[] you will > always end up with a nonsymetric preconditioner. I think for one dimension > you need to use > >>> > >>> No I don't think that is right. The problem below is that you have > overlap in only one direction. Process 0 overlaps > >>> Process 1, but Process 1 has no overlap of Process 0. This is not how > Schwarz is generally envisioned. > >> Sure it is. > >>> Imagine the linear algebra viewpoint, which I think is cleaner here. > You partition the matrix rows into non-overlapping > >>> sets. These sets are is_local[]. Then any information you get from > another domain is another row, which is put into > >>> is[]. You can certainly have a non-symmetric overlap, which you have > below, but it mean one way information > >>> transmission which is strange for convergence. > >> No, not a all. > >> > >> > >> | 0 1 2 3 4 5 6 | > >> > >> Domain 0 is the region from | to 4 with Dirichlet boundary > conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet > boundary conditions at each end (2 and |) . > >> > >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all > kinds of VecScatter creations from the various is and is_local, > ?restriction?, ?prolongation? and ?localization? then in the apply the > different scatters are applied in the two directions, which results in a > non-symmetric operator. > > > > I was able to get my uniprocessor example to give the (symmetric) > preconditioner I expected by commenting out the check in PCSetUp_ASM (line > 311 in asm.c) > > if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, > "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); > > This check is absurd and needs to be removed. 
> > > and using PCASMSetLocalSubdomains with the same (overlapping) IS's for > both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It > also works passing NULL for is_local. > > Great. > > > > I assume that the purpose of the check mentioned above is to ensure that > every grid point is assigned to exactly one processor, which is needed by > whatever interprocess scattering goes on in the implementation. Also, I > assume that augmenting the domain definition with an explicit specification > of the way domains are distributed over processes allows for more > controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > > > Anyhow, Barry's advice previously in this thread works locally (for one > test case) if you remove the check above, but the current implementation > enforces something related to what Matt describes, which might be overly > restrictive if multiple domains share a process. The impression I got > initially from the documentation was that if one uses PC_ASM_BASIC, the > choice of is_local should only influence the details of the communication > pattern, not (in exact arithmetic, with process-count-independent > subsolves) the preconditioner being defined. > > The ?communication pattern? does determine the preconditioner being > defined. > > The introduction of is_local[] broke the clean usage of PC_ASM_* that > use to exist, so your confusion is our fault, not yours. > > > > > > For regular grids this all seems pretty pathological (in practice I > imagine people want to use symmetric overlaps, > > As I said above you are using "symmetric overlaps,?. It just looks > ?unsymmetric? if you introduce this concept of ?non-overlapping initial > subdomains? which is an unneeded harmful concept. > I really do not understand what you are saying here. If you want to do RASM, then you must be able to tell the difference between your domain and the overlap. That is all that the distinction between is and islocal does. Your original implementation of RASM was wanting because it merely dropped communication. If you have several domains on a process, then this is not RASM. Matt > > Barry > > > > and I assume that one domain per node is the most common use case), but > I could imagine it being more of a real concern when working with > unstructured grids. > > > >> > >> Barry > >> > >> > >> > >>> Matt > >>> > >>>> is[0] <-- 0 1 2 3 > >>>> is[1] <-- 3 4 5 6 > >>>> is_local[0] <-- 0 1 2 3 > >>>> is_local[1] <-- 3 4 5 6 > >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > >>> > >>> Barry > >>> > >>> > >>> Note that is_local[] doesn?t have to be non-overlapping or anything. > >>> > >>> > >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan > wrote: > >>> > >>>> For the purposes of reproducing an example from a paper, I'd like to > use PCASM with subdomains which 'overlap minimally' (though this is > probably never a good idea in practice). > >>>> > >>>> In one dimension with 7 unknowns and 2 domains, this might look like > >>>> > >>>> 0 1 2 3 4 5 6 (unknowns) > >>>> ------------ (first subdomain : 0 .. 3) > >>>> ----------- (second subdomain : 3 .. 6) > >>>> > >>>> The subdomains share only a single grid point, which differs from the > way PCASM is used in most of the examples. > >>>> > >>>> In two dimensions, minimally overlapping rectangular subdomains would > overlap one exactly one row or column of the grid. 
Thus, for example, if > the grid unknowns were > >>>> > >>>> 0 1 2 3 4 5 | > >>>> 6 7 8 9 10 11 | | > >>>> 12 13 14 15 16 17 | > >>>> -------- > >>>> ----------- > >>>> > >>>> then one minimally-overlapping set of 4 subdomains would be > >>>> 0 1 2 3 6 7 8 9 > >>>> 3 4 5 9 10 11 > >>>> 6 7 8 9 12 13 14 15 > >>>> 9 10 11 15 16 17 > >>>> as suggested by the dashes and pipes above. The subdomains only > overlap by a single row or column of the grid. > >>>> > >>>> My question is whether and how one can use the PCASM interface to > work with these sorts of decompositions (It's fine for my purposes to use a > single MPI process). In particular, I don't quite understand if should be > possible to define these decompositions by correctly providing is and > is_local arguments to PCASMSetLocalSubdomains. > >>>> > >>>> I have gotten code to run defining the is_local entries to be subsets > of the is entries which define a partition of the global degrees of > freedom*, but I'm not certain that this was the correct choice, as it > appears to produce an unsymmetric preconditioner for a symmetric system > when I use direct subdomain solves and the 'basic' type for PCASM. > >>>> > >>>> * For example, in the 1D example above this would correspond to > >>>> is[0] <-- 0 1 2 3 > >>>> is[1] <-- 3 4 5 6 > >>>> is_local[0] <-- 0 1 2 > >>>> is_local[1] <-- 3 4 5 6 > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>> -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 17 17:08:56 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Sep 2014 17:08:56 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> Message-ID: <3240385C-FF18-4622-931F-484C8A2AA554@mcs.anl.gov> On Sep 17, 2014, at 3:38 PM, Matthew Knepley wrote: > On Wed, Sep 17, 2014 at 3:12 PM, Barry Smith wrote: > > On Sep 17, 2014, at 3:03 PM, Patrick Sanan wrote: > > > On 9/16/14 9:43 PM, Barry Smith wrote: > >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: > >> > >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: > >>> > >>> Patrick, > >>> > >>> This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use > >>> > >>> No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps > >>> Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. > >> Sure it is. > >>> Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping > >>> sets. These sets are is_local[]. 
Then any information you get from another domain is another row, which is put into > >>> is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information > >>> transmission which is strange for convergence. > >> No, not a all. > >> > >> > >> | 0 1 2 3 4 5 6 | > >> > >> Domain 0 is the region from | to 4 with Dirichlet boundary conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary conditions at each end (2 and |) . > >> > >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all kinds of VecScatter creations from the various is and is_local, ?restriction?, ?prolongation? and ?localization? then in the apply the different scatters are applied in the two directions, which results in a non-symmetric operator. > > > > I was able to get my uniprocessor example to give the (symmetric) preconditioner I expected by commenting out the check in PCSetUp_ASM (line 311 in asm.c) > > if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); > > This check is absurd and needs to be removed. > > > and using PCASMSetLocalSubdomains with the same (overlapping) IS's for both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It also works passing NULL for is_local. > > Great. > > > > I assume that the purpose of the check mentioned above is to ensure that every grid point is assigned to exactly one processor, which is needed by whatever interprocess scattering goes on in the implementation. Also, I assume that augmenting the domain definition with an explicit specification of the way domains are distributed over processes allows for more controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > > > Anyhow, Barry's advice previously in this thread works locally (for one test case) if you remove the check above, but the current implementation enforces something related to what Matt describes, which might be overly restrictive if multiple domains share a process. The impression I got initially from the documentation was that if one uses PC_ASM_BASIC, the choice of is_local should only influence the details of the communication pattern, not (in exact arithmetic, with process-count-independent subsolves) the preconditioner being defined. > > The ?communication pattern? does determine the preconditioner being defined. > > The introduction of is_local[] broke the clean usage of PC_ASM_* that use to exist, so your confusion is our fault, not yours. > > > > > > For regular grids this all seems pretty pathological (in practice I imagine people want to use symmetric overlaps, > > As I said above you are using "symmetric overlaps,?. It just looks ?unsymmetric? if you introduce this concept of ?non-overlapping initial subdomains? which is an unneeded harmful concept. > > I really do not understand what you are saying here. If you want to do RASM, then you must be able to > tell the difference between your domain and the overlap. That is all that the distinction between is and > islocal does. Your original implementation of RASM was wanting because it merely dropped communication. > If you have several domains on a process, then this is not RASM. Correct. The problem is that your extra test prevented perfectly valid ASM configurations. You are right that these ?perfectly valid ASM configurations? 
do not have an RASM form (if we define RASM as strictly requiring non-overlapping domains that then get extended and either the restriction or prolongation skips the overlap region) but they are still valid ASM preconditioners so shouldn?t error out just because they cannot be used for RASM. Barry Note that I am calling any collection of domains which may or may not overlap, which may have a ?single grid point? overlap or more as valid ASM configurations, because they are (i.e. the set of valid ASM configurations is larger than the set of RASM configurations). So my valid ASM configurations is more then just domains obtained by taking a non-overlapping set of domains and then "growing the domains? and I wanted the code to support this, hence I removed the extra test. > > Matt > > > Barry > > > > and I assume that one domain per node is the most common use case), but I could imagine it being more of a real concern when working with unstructured grids. > > > >> > >> Barry > >> > >> > >> > >>> Matt > >>> > >>>> is[0] <-- 0 1 2 3 > >>>> is[1] <-- 3 4 5 6 > >>>> is_local[0] <-- 0 1 2 3 > >>>> is_local[1] <-- 3 4 5 6 > >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > >>> > >>> Barry > >>> > >>> > >>> Note that is_local[] doesn?t have to be non-overlapping or anything. > >>> > >>> > >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: > >>> > >>>> For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). > >>>> > >>>> In one dimension with 7 unknowns and 2 domains, this might look like > >>>> > >>>> 0 1 2 3 4 5 6 (unknowns) > >>>> ------------ (first subdomain : 0 .. 3) > >>>> ----------- (second subdomain : 3 .. 6) > >>>> > >>>> The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. > >>>> > >>>> In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were > >>>> > >>>> 0 1 2 3 4 5 | > >>>> 6 7 8 9 10 11 | | > >>>> 12 13 14 15 16 17 | > >>>> -------- > >>>> ----------- > >>>> > >>>> then one minimally-overlapping set of 4 subdomains would be > >>>> 0 1 2 3 6 7 8 9 > >>>> 3 4 5 9 10 11 > >>>> 6 7 8 9 12 13 14 15 > >>>> 9 10 11 15 16 17 > >>>> as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. > >>>> > >>>> My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. > >>>> > >>>> I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. 
> >>>> > >>>> * For example, in the 1D example above this would correspond to > >>>> is[0] <-- 0 1 2 3 > >>>> is[1] <-- 3 4 5 6 > >>>> is_local[0] <-- 0 1 2 > >>>> is_local[1] <-- 3 4 5 6 > >>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Wed Sep 17 19:29:13 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Sep 2014 19:29:13 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <3240385C-FF18-4622-931F-484C8A2AA554@mcs.anl.gov> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> <3240385C-FF18-4622-931F-484C8A2AA554@mcs.anl.gov> Message-ID: On Wed, Sep 17, 2014 at 5:08 PM, Barry Smith wrote: > > On Sep 17, 2014, at 3:38 PM, Matthew Knepley wrote: > > > On Wed, Sep 17, 2014 at 3:12 PM, Barry Smith wrote: > > > > On Sep 17, 2014, at 3:03 PM, Patrick Sanan > wrote: > > > > > On 9/16/14 9:43 PM, Barry Smith wrote: > > >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley > wrote: > > >> > > >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith > wrote: > > >>> > > >>> Patrick, > > >>> > > >>> This "local part of the subdomains for this processor? term in > PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I > think that if you set the is_local[] to be different than the is[] you will > always end up with a nonsymetric preconditioner. I think for one dimension > you need to use > > >>> > > >>> No I don't think that is right. The problem below is that you have > overlap in only one direction. Process 0 overlaps > > >>> Process 1, but Process 1 has no overlap of Process 0. This is not > how Schwarz is generally envisioned. > > >> Sure it is. > > >>> Imagine the linear algebra viewpoint, which I think is cleaner here. > You partition the matrix rows into non-overlapping > > >>> sets. These sets are is_local[]. Then any information you get from > another domain is another row, which is put into > > >>> is[]. You can certainly have a non-symmetric overlap, which you have > below, but it mean one way information > > >>> transmission which is strange for convergence. > > >> No, not a all. > > >> > > >> > > >> | 0 1 2 3 4 5 6 | > > >> > > >> Domain 0 is the region from | to 4 with Dirichlet boundary > conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet > boundary conditions at each end (2 and |) . > > >> > > >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all > kinds of VecScatter creations from the various is and is_local, > ?restriction?, ?prolongation? and ?localization? then in the apply the > different scatters are applied in the two directions, which results in a > non-symmetric operator. 
> > > > > > I was able to get my uniprocessor example to give the (symmetric) > preconditioner I expected by commenting out the check in PCSetUp_ASM (line > 311 in asm.c) > > > > if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, > "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); > > > > This check is absurd and needs to be removed. > > > > > and using PCASMSetLocalSubdomains with the same (overlapping) IS's for > both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It > also works passing NULL for is_local. > > > > Great. > > > > > > I assume that the purpose of the check mentioned above is to ensure > that every grid point is assigned to exactly one processor, which is needed > by whatever interprocess scattering goes on in the implementation. Also, I > assume that augmenting the domain definition with an explicit specification > of the way domains are distributed over processes allows for more > controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > > > > > Anyhow, Barry's advice previously in this thread works locally (for > one test case) if you remove the check above, but the current > implementation enforces something related to what Matt describes, which > might be overly restrictive if multiple domains share a process. The > impression I got initially from the documentation was that if one uses > PC_ASM_BASIC, the choice of is_local should only influence the details of > the communication pattern, not (in exact arithmetic, with > process-count-independent subsolves) the preconditioner being defined. > > > > The ?communication pattern? does determine the preconditioner being > defined. > > > > The introduction of is_local[] broke the clean usage of PC_ASM_* > that use to exist, so your confusion is our fault, not yours. > > > > > > > > > For regular grids this all seems pretty pathological (in practice I > imagine people want to use symmetric overlaps, > > > > As I said above you are using "symmetric overlaps,?. It just looks > ?unsymmetric? if you introduce this concept of ?non-overlapping initial > subdomains? which is an unneeded harmful concept. > > > > I really do not understand what you are saying here. If you want to do > RASM, then you must be able to > > tell the difference between your domain and the overlap. That is all > that the distinction between is and > > islocal does. Your original implementation of RASM was wanting because > it merely dropped communication. > > If you have several domains on a process, then this is not RASM. > > Correct. > > The problem is that your extra test prevented perfectly valid ASM > configurations. You are right that these ?perfectly valid ASM > configurations? do not have an RASM form (if we define RASM as strictly > requiring non-overlapping domains that then get extended and either the > restriction or prolongation skips the overlap region) but they are still > valid ASM preconditioners so shouldn?t error out just because they cannot > be used for RASM. > > Barry > > Note that I am calling any collection of domains which may or may not > overlap, which may have a ?single grid point? overlap or more as valid ASM > configurations, because they are (i.e. the set of valid ASM configurations > is larger than the set of RASM configurations). So my valid ASM > configurations is more then just domains obtained by taking a > non-overlapping set of domains and then "growing the domains? and I wanted > the code to support this, hence I removed the extra test. 
That is fine. We must make sure PETSc properly throws an error if someone selects PC_ASM_RESTRICT. Matt > > > > > Matt > > > > > > Barry > > > > > > > and I assume that one domain per node is the most common use case), > but I could imagine it being more of a real concern when working with > unstructured grids. > > > > > >> > > >> Barry > > >> > > >> > > >> > > >>> Matt > > >>> > > >>>> is[0] <-- 0 1 2 3 > > >>>> is[1] <-- 3 4 5 6 > > >>>> is_local[0] <-- 0 1 2 3 > > >>>> is_local[1] <-- 3 4 5 6 > > >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > > >>> > > >>> Barry > > >>> > > >>> > > >>> Note that is_local[] doesn?t have to be non-overlapping or anything. > > >>> > > >>> > > >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan > wrote: > > >>> > > >>>> For the purposes of reproducing an example from a paper, I'd like > to use PCASM with subdomains which 'overlap minimally' (though this is > probably never a good idea in practice). > > >>>> > > >>>> In one dimension with 7 unknowns and 2 domains, this might look like > > >>>> > > >>>> 0 1 2 3 4 5 6 (unknowns) > > >>>> ------------ (first subdomain : 0 .. 3) > > >>>> ----------- (second subdomain : 3 .. 6) > > >>>> > > >>>> The subdomains share only a single grid point, which differs from > the way PCASM is used in most of the examples. > > >>>> > > >>>> In two dimensions, minimally overlapping rectangular subdomains > would overlap one exactly one row or column of the grid. Thus, for example, > if the grid unknowns were > > >>>> > > >>>> 0 1 2 3 4 5 | > > >>>> 6 7 8 9 10 11 | | > > >>>> 12 13 14 15 16 17 | > > >>>> -------- > > >>>> ----------- > > >>>> > > >>>> then one minimally-overlapping set of 4 subdomains would be > > >>>> 0 1 2 3 6 7 8 9 > > >>>> 3 4 5 9 10 11 > > >>>> 6 7 8 9 12 13 14 15 > > >>>> 9 10 11 15 16 17 > > >>>> as suggested by the dashes and pipes above. The subdomains only > overlap by a single row or column of the grid. > > >>>> > > >>>> My question is whether and how one can use the PCASM interface to > work with these sorts of decompositions (It's fine for my purposes to use a > single MPI process). In particular, I don't quite understand if should be > possible to define these decompositions by correctly providing is and > is_local arguments to PCASMSetLocalSubdomains. > > >>>> > > >>>> I have gotten code to run defining the is_local entries to be > subsets of the is entries which define a partition of the global degrees of > freedom*, but I'm not certain that this was the correct choice, as it > appears to produce an unsymmetric preconditioner for a symmetric system > when I use direct subdomain solves and the 'basic' type for PCASM. > > >>>> > > >>>> * For example, in the 1D example above this would correspond to > > >>>> is[0] <-- 0 1 2 3 > > >>>> is[1] <-- 3 4 5 6 > > >>>> is_local[0] <-- 0 1 2 > > >>>> is_local[1] <-- 3 4 5 6 > > >>>> > > >>>> > > >>>> > > >>>> > > >>> > > >>> > > >>> > > >>> -- > > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > >>> -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
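To make the working configuration described above concrete, here is a minimal sketch (assembled for this summary, not code from the thread) of the 1-D, 7-unknown, two-subdomain case on a single MPI process. It assumes a symmetric Mat A has already been assembled on PETSC_COMM_SELF, that petscksp.h is included and the code sits inside a function returning PetscErrorCode, and that the PCSetUp_ASM size check discussed above has been relaxed in the PETSc build being used.

    KSP            ksp;
    PC             pc;
    IS             is[2];
    const PetscInt d0[] = {0,1,2,3}, d1[] = {3,4,5,6};   /* subdomains share only grid point 3 */
    PetscErrorCode ierr;

    ierr = ISCreateGeneral(PETSC_COMM_SELF,4,d0,PETSC_COPY_VALUES,&is[0]);CHKERRQ(ierr);
    ierr = ISCreateGeneral(PETSC_COMM_SELF,4,d1,PETSC_COPY_VALUES,&is[1]);CHKERRQ(ierr);

    ierr = KSPCreate(PETSC_COMM_SELF,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
    ierr = PCASMSetType(pc,PC_ASM_BASIC);CHKERRQ(ierr);
    /* Same overlapping index sets for both is and is_local, as reported to work above */
    ierr = PCASMSetLocalSubdomains(pc,2,is,is);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

Direct subdomain solves, as used in the experiment above, can then be requested at run time with -sub_ksp_type preonly -sub_pc_type lu.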
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 17 20:36:59 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Sep 2014 20:36:59 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> <3240385C-FF18-4622-931F-484C8A2AA554@mcs.anl.gov> Message-ID: <2B231C4B-2E29-48D3-8EF3-E20F402A13F0@mcs.anl.gov> On Sep 17, 2014, at 7:29 PM, Matthew Knepley wrote: > On Wed, Sep 17, 2014 at 5:08 PM, Barry Smith wrote: > > On Sep 17, 2014, at 3:38 PM, Matthew Knepley wrote: > > > On Wed, Sep 17, 2014 at 3:12 PM, Barry Smith wrote: > > > > On Sep 17, 2014, at 3:03 PM, Patrick Sanan wrote: > > > > > On 9/16/14 9:43 PM, Barry Smith wrote: > > >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley wrote: > > >> > > >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith wrote: > > >>> > > >>> Patrick, > > >>> > > >>> This "local part of the subdomains for this processor? term in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that if you set the is_local[] to be different than the is[] you will always end up with a nonsymetric preconditioner. I think for one dimension you need to use > > >>> > > >>> No I don't think that is right. The problem below is that you have overlap in only one direction. Process 0 overlaps > > >>> Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz is generally envisioned. > > >> Sure it is. > > >>> Imagine the linear algebra viewpoint, which I think is cleaner here. You partition the matrix rows into non-overlapping > > >>> sets. These sets are is_local[]. Then any information you get from another domain is another row, which is put into > > >>> is[]. You can certainly have a non-symmetric overlap, which you have below, but it mean one way information > > >>> transmission which is strange for convergence. > > >> No, not a all. > > >> > > >> > > >> | 0 1 2 3 4 5 6 | > > >> > > >> Domain 0 is the region from | to 4 with Dirichlet boundary conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary conditions at each end (2 and |) . > > >> > > >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all kinds of VecScatter creations from the various is and is_local, ?restriction?, ?prolongation? and ?localization? then in the apply the different scatters are applied in the two directions, which results in a non-symmetric operator. > > > > > > I was able to get my uniprocessor example to give the (symmetric) preconditioner I expected by commenting out the check in PCSetUp_ASM (line 311 in asm.c) > > > > if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); > > > > This check is absurd and needs to be removed. > > > > > and using PCASMSetLocalSubdomains with the same (overlapping) IS's for both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It also works passing NULL for is_local. > > > > Great. > > > > > > I assume that the purpose of the check mentioned above is to ensure that every grid point is assigned to exactly one processor, which is needed by whatever interprocess scattering goes on in the implementation. 
Also, I assume that augmenting the domain definition with an explicit specification of the way domains are distributed over processes allows for more controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > > > > > Anyhow, Barry's advice previously in this thread works locally (for one test case) if you remove the check above, but the current implementation enforces something related to what Matt describes, which might be overly restrictive if multiple domains share a process. The impression I got initially from the documentation was that if one uses PC_ASM_BASIC, the choice of is_local should only influence the details of the communication pattern, not (in exact arithmetic, with process-count-independent subsolves) the preconditioner being defined. > > > > The ?communication pattern? does determine the preconditioner being defined. > > > > The introduction of is_local[] broke the clean usage of PC_ASM_* that use to exist, so your confusion is our fault, not yours. > > > > > > > > > For regular grids this all seems pretty pathological (in practice I imagine people want to use symmetric overlaps, > > > > As I said above you are using "symmetric overlaps,?. It just looks ?unsymmetric? if you introduce this concept of ?non-overlapping initial subdomains? which is an unneeded harmful concept. > > > > I really do not understand what you are saying here. If you want to do RASM, then you must be able to > > tell the difference between your domain and the overlap. That is all that the distinction between is and > > islocal does. Your original implementation of RASM was wanting because it merely dropped communication. > > If you have several domains on a process, then this is not RASM. > > Correct. > > The problem is that your extra test prevented perfectly valid ASM configurations. You are right that these ?perfectly valid ASM configurations? do not have an RASM form (if we define RASM as strictly requiring non-overlapping domains that then get extended and either the restriction or prolongation skips the overlap region) but they are still valid ASM preconditioners so shouldn?t error out just because they cannot be used for RASM. > > Barry > > Note that I am calling any collection of domains which may or may not overlap, which may have a ?single grid point? overlap or more as valid ASM configurations, because they are (i.e. the set of valid ASM configurations is larger than the set of RASM configurations). So my valid ASM configurations is more then just domains obtained by taking a non-overlapping set of domains and then "growing the domains? and I wanted the code to support this, hence I removed the extra test. > > That is fine. We must make sure PETSc properly throws an error if someone selects PC_ASM_RESTRICT. I?m not sure. It depends on your definition of PC_ASM_RESTRICT > > > Matt > > > > > > Matt > > > > > > Barry > > > > > > > and I assume that one domain per node is the most common use case), but I could imagine it being more of a real concern when working with unstructured grids. > > > > > >> > > >> Barry > > >> > > >> > > >> > > >>> Matt > > >>> > > >>>> is[0] <-- 0 1 2 3 > > >>>> is[1] <-- 3 4 5 6 > > >>>> is_local[0] <-- 0 1 2 3 > > >>>> is_local[1] <-- 3 4 5 6 > > >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > > >>> > > >>> Barry > > >>> > > >>> > > >>> Note that is_local[] doesn?t have to be non-overlapping or anything. 
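A minimal sketch of the alternative quoted just above (pass NULL for is_local and suppress any automatically generated overlap); the variable names follow the sketch shown earlier in this thread and are assumptions, not code from the thread.

    /* Only the (already overlapping) subdomains are given; is_local is left to PETSc. */
    ierr = PCASMSetLocalSubdomains(pc,2,is,NULL);CHKERRQ(ierr);
    ierr = PCASMSetOverlap(pc,0);CHKERRQ(ierr);   /* do not grow the given domains any further */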
> > >>> > > >>> > > >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan wrote: > > >>> > > >>>> For the purposes of reproducing an example from a paper, I'd like to use PCASM with subdomains which 'overlap minimally' (though this is probably never a good idea in practice). > > >>>> > > >>>> In one dimension with 7 unknowns and 2 domains, this might look like > > >>>> > > >>>> 0 1 2 3 4 5 6 (unknowns) > > >>>> ------------ (first subdomain : 0 .. 3) > > >>>> ----------- (second subdomain : 3 .. 6) > > >>>> > > >>>> The subdomains share only a single grid point, which differs from the way PCASM is used in most of the examples. > > >>>> > > >>>> In two dimensions, minimally overlapping rectangular subdomains would overlap one exactly one row or column of the grid. Thus, for example, if the grid unknowns were > > >>>> > > >>>> 0 1 2 3 4 5 | > > >>>> 6 7 8 9 10 11 | | > > >>>> 12 13 14 15 16 17 | > > >>>> -------- > > >>>> ----------- > > >>>> > > >>>> then one minimally-overlapping set of 4 subdomains would be > > >>>> 0 1 2 3 6 7 8 9 > > >>>> 3 4 5 9 10 11 > > >>>> 6 7 8 9 12 13 14 15 > > >>>> 9 10 11 15 16 17 > > >>>> as suggested by the dashes and pipes above. The subdomains only overlap by a single row or column of the grid. > > >>>> > > >>>> My question is whether and how one can use the PCASM interface to work with these sorts of decompositions (It's fine for my purposes to use a single MPI process). In particular, I don't quite understand if should be possible to define these decompositions by correctly providing is and is_local arguments to PCASMSetLocalSubdomains. > > >>>> > > >>>> I have gotten code to run defining the is_local entries to be subsets of the is entries which define a partition of the global degrees of freedom*, but I'm not certain that this was the correct choice, as it appears to produce an unsymmetric preconditioner for a symmetric system when I use direct subdomain solves and the 'basic' type for PCASM. > > >>>> > > >>>> * For example, in the 1D example above this would correspond to > > >>>> is[0] <-- 0 1 2 3 > > >>>> is[1] <-- 3 4 5 6 > > >>>> is_local[0] <-- 0 1 2 > > >>>> is_local[1] <-- 3 4 5 6 > > >>>> > > >>>> > > >>>> > > >>>> > > >>> > > >>> > > >>> > > >>> -- > > >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > >>> -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener From knepley at gmail.com Wed Sep 17 20:39:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Sep 2014 20:39:19 -0500 Subject: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains In-Reply-To: <2B231C4B-2E29-48D3-8EF3-E20F402A13F0@mcs.anl.gov> References: <54185BEB.9060803@gmail.com> <774BAF6B-620D-448D-ADB5-03B3455CFC0D@mcs.anl.gov> <84324D7E-9146-4C84-9301-FA2F7603B167@mcs.anl.gov> <5419E917.2060405@gmail.com> <3C54C269-CF81-45E2-8BB2-85ACB67545D6@mcs.anl.gov> <3240385C-FF18-4622-931F-484C8A2AA554@mcs.anl.gov> <2B231C4B-2E29-48D3-8EF3-E20F402A13F0@mcs.anl.gov> Message-ID: On Wed, Sep 17, 2014 at 8:36 PM, Barry Smith wrote: > > On Sep 17, 2014, at 7:29 PM, Matthew Knepley wrote: > > > On Wed, Sep 17, 2014 at 5:08 PM, Barry Smith wrote: > > > > On Sep 17, 2014, at 3:38 PM, Matthew Knepley wrote: > > > > > On Wed, Sep 17, 2014 at 3:12 PM, Barry Smith > wrote: > > > > > > On Sep 17, 2014, at 3:03 PM, Patrick Sanan > wrote: > > > > > > > On 9/16/14 9:43 PM, Barry Smith wrote: > > > >> On Sep 16, 2014, at 2:29 PM, Matthew Knepley > wrote: > > > >> > > > >>> On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith > wrote: > > > >>> > > > >>> Patrick, > > > >>> > > > >>> This "local part of the subdomains for this processor? term > in PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, > I think that if you set the is_local[] to be different than the is[] you > will always end up with a nonsymetric preconditioner. I think for one > dimension you need to use > > > >>> > > > >>> No I don't think that is right. The problem below is that you have > overlap in only one direction. Process 0 overlaps > > > >>> Process 1, but Process 1 has no overlap of Process 0. This is not > how Schwarz is generally envisioned. > > > >> Sure it is. > > > >>> Imagine the linear algebra viewpoint, which I think is cleaner > here. You partition the matrix rows into non-overlapping > > > >>> sets. These sets are is_local[]. Then any information you get from > another domain is another row, which is put into > > > >>> is[]. You can certainly have a non-symmetric overlap, which you > have below, but it mean one way information > > > >>> transmission which is strange for convergence. > > > >> No, not a all. > > > >> > > > >> > > > >> | 0 1 2 3 4 5 6 | > > > >> > > > >> Domain 0 is the region from | to 4 with Dirichlet boundary > conditions at each end (| and 4). Domain 1 is from 2 to | with Dirichlet > boundary conditions at each end (2 and |) . > > > >> > > > >> If you look at the PCSetUp_ASM() and PCApply_ASM() you?ll see all > kinds of VecScatter creations from the various is and is_local, > ?restriction?, ?prolongation? and ?localization? then in the apply the > different scatters are applied in the two directions, which results in a > non-symmetric operator. > > > > > > > > I was able to get my uniprocessor example to give the (symmetric) > preconditioner I expected by commenting out the check in PCSetUp_ASM (line > 311 in asm.c) > > > > > > if (firstRow != lastRow) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_PLIB, > "Specified ASM subdomain sizes were invalid: %d != %d", firstRow, lastRow); > > > > > > This check is absurd and needs to be removed. > > > > > > > and using PCASMSetLocalSubdomains with the same (overlapping) IS's > for both is and is_local ([0 1 2 3] and [3 4 5 6] in the example above). It > also works passing NULL for is_local. > > > > > > Great. 
> > > > > > > > I assume that the purpose of the check mentioned above is to ensure > that every grid point is assigned to exactly one processor, which is needed > by whatever interprocess scattering goes on in the implementation. Also, I > assume that augmenting the domain definition with an explicit specification > of the way domains are distributed over processes allows for more > controllable use of PC_ASM_RESTRICT, with all its attractive properties. > > > > > > > > Anyhow, Barry's advice previously in this thread works locally (for > one test case) if you remove the check above, but the current > implementation enforces something related to what Matt describes, which > might be overly restrictive if multiple domains share a process. The > impression I got initially from the documentation was that if one uses > PC_ASM_BASIC, the choice of is_local should only influence the details of > the communication pattern, not (in exact arithmetic, with > process-count-independent subsolves) the preconditioner being defined. > > > > > > The ?communication pattern? does determine the preconditioner being > defined. > > > > > > The introduction of is_local[] broke the clean usage of PC_ASM_* > that use to exist, so your confusion is our fault, not yours. > > > > > > > > > > > > For regular grids this all seems pretty pathological (in practice I > imagine people want to use symmetric overlaps, > > > > > > As I said above you are using "symmetric overlaps,?. It just looks > ?unsymmetric? if you introduce this concept of ?non-overlapping initial > subdomains? which is an unneeded harmful concept. > > > > > > I really do not understand what you are saying here. If you want to do > RASM, then you must be able to > > > tell the difference between your domain and the overlap. That is all > that the distinction between is and > > > islocal does. Your original implementation of RASM was wanting because > it merely dropped communication. > > > If you have several domains on a process, then this is not RASM. > > > > Correct. > > > > The problem is that your extra test prevented perfectly valid ASM > configurations. You are right that these ?perfectly valid ASM > configurations? do not have an RASM form (if we define RASM as strictly > requiring non-overlapping domains that then get extended and either the > restriction or prolongation skips the overlap region) but they are still > valid ASM preconditioners so shouldn?t error out just because they cannot > be used for RASM. > > > > Barry > > > > Note that I am calling any collection of domains which may or may not > overlap, which may have a ?single grid point? overlap or more as valid ASM > configurations, because they are (i.e. the set of valid ASM configurations > is larger than the set of RASM configurations). So my valid ASM > configurations is more then just domains obtained by taking a > non-overlapping set of domains and then "growing the domains? and I wanted > the code to support this, hence I removed the extra test. > > > > That is fine. We must make sure PETSc properly throws an error if > someone selects PC_ASM_RESTRICT. > > I?m not sure. It depends on your definition of PC_ASM_RESTRICT > I would disable it when is == isLocal, since at the least it would be very misleading. Matt > > > > > > Matt > > > > > > > > > > Matt > > > > > > > > > Barry > > > > > > > > > > and I assume that one domain per node is the most common use case), > but I could imagine it being more of a real concern when working with > unstructured grids. 
> > > > > > > >> > > > >> Barry > > > >> > > > >> > > > >> > > > >>> Matt > > > >>> > > > >>>> is[0] <-- 0 1 2 3 > > > >>>> is[1] <-- 3 4 5 6 > > > >>>> is_local[0] <-- 0 1 2 3 > > > >>>> is_local[1] <-- 3 4 5 6 > > > >>> Or you can pass NULL for is_local use PCASMSetOverlap(pc,0); > > > >>> > > > >>> Barry > > > >>> > > > >>> > > > >>> Note that is_local[] doesn?t have to be non-overlapping or > anything. > > > >>> > > > >>> > > > >>> On Sep 16, 2014, at 10:48 AM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > >>> > > > >>>> For the purposes of reproducing an example from a paper, I'd like > to use PCASM with subdomains which 'overlap minimally' (though this is > probably never a good idea in practice). > > > >>>> > > > >>>> In one dimension with 7 unknowns and 2 domains, this might look > like > > > >>>> > > > >>>> 0 1 2 3 4 5 6 (unknowns) > > > >>>> ------------ (first subdomain : 0 .. 3) > > > >>>> ----------- (second subdomain : 3 .. 6) > > > >>>> > > > >>>> The subdomains share only a single grid point, which differs from > the way PCASM is used in most of the examples. > > > >>>> > > > >>>> In two dimensions, minimally overlapping rectangular subdomains > would overlap one exactly one row or column of the grid. Thus, for example, > if the grid unknowns were > > > >>>> > > > >>>> 0 1 2 3 4 5 | > > > >>>> 6 7 8 9 10 11 | | > > > >>>> 12 13 14 15 16 17 | > > > >>>> -------- > > > >>>> ----------- > > > >>>> > > > >>>> then one minimally-overlapping set of 4 subdomains would be > > > >>>> 0 1 2 3 6 7 8 9 > > > >>>> 3 4 5 9 10 11 > > > >>>> 6 7 8 9 12 13 14 15 > > > >>>> 9 10 11 15 16 17 > > > >>>> as suggested by the dashes and pipes above. The subdomains only > overlap by a single row or column of the grid. > > > >>>> > > > >>>> My question is whether and how one can use the PCASM interface to > work with these sorts of decompositions (It's fine for my purposes to use a > single MPI process). In particular, I don't quite understand if should be > possible to define these decompositions by correctly providing is and > is_local arguments to PCASMSetLocalSubdomains. > > > >>>> > > > >>>> I have gotten code to run defining the is_local entries to be > subsets of the is entries which define a partition of the global degrees of > freedom*, but I'm not certain that this was the correct choice, as it > appears to produce an unsymmetric preconditioner for a symmetric system > when I use direct subdomain solves and the 'basic' type for PCASM. > > > >>>> > > > >>>> * For example, in the 1D example above this would correspond to > > > >>>> is[0] <-- 0 1 2 3 > > > >>>> is[1] <-- 3 4 5 6 > > > >>>> is_local[0] <-- 0 1 2 > > > >>>> is_local[1] <-- 3 4 5 6 > > > >>>> > > > >>>> > > > >>>> > > > >>>> > > > >>> > > > >>> > > > >>> > > > >>> -- > > > >>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > >>> -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
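For reference, the Schwarz variant being debated here is selected explicitly, either in code or with -pc_asm_type basic|restrict|interpolate|none on the command line. A hedged one-line sketch, with the caveat from the discussion above that the restricted form is at best misleading when is and is_local coincide:

    ierr = PCASMSetType(pc,PC_ASM_RESTRICT);CHKERRQ(ierr);  /* or PC_ASM_BASIC, PC_ASM_INTERPOLATE, PC_ASM_NONE */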
> > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From d022117 at polito.it Thu Sep 18 17:24:50 2014 From: d022117 at polito.it (PEREZ CERQUERA MANUEL RICARDO) Date: Fri, 19 Sep 2014 00:24:50 +0200 Subject: [petsc-users] unsuscribe Message-ID: HI, i would like to dismiss my account. Thank you Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student Antenna and EMC Lab (LACE) Istituto Superiore Mario Boella (ISMB) Politecnico di Torino Via Pier Carlo Boggio 61, Torino 10138, Italy Email: manuel.perezcerquera at polito.it Phone: +39 0112276704 Fax: +39 011 2276 299 From balay at mcs.anl.gov Thu Sep 18 17:41:10 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 18 Sep 2014 17:41:10 -0500 Subject: [petsc-users] unsuscribe In-Reply-To: References: Message-ID: Anyone can unsubscribe by following the mailing list links thats included in every mailing-list email. Notice the mail header in each list e-mail: [different mailers might process/show this differently] List-Unsubscribe: , And its best to use the same e-mail id as you are subscribed. Here you are posting with d022117 at polito.it - but you were subscribed with manuel.perezcerquera at polito.it. But I was able to match the email-id from your name listed in this e-mail. You are unsubscribed now. Satish On Thu, 18 Sep 2014, PEREZ CERQUERA MANUEL RICARDO wrote: > HI, > > i would like to dismiss my account. > > Thank you > > Eng. Manuel Ricardo Perez Cerquera. MSc. Ph.D student > Antenna and EMC Lab (LACE) > Istituto Superiore Mario Boella (ISMB) > Politecnico di Torino > Via Pier Carlo Boggio 61, Torino 10138, Italy > Email: manuel.perezcerquera at polito.it > Phone: +39 0112276704 > Fax: +39 011 2276 299 > > From fshi at fit.edu Thu Sep 18 21:28:36 2014 From: fshi at fit.edu (Feng Shi) Date: Fri, 19 Sep 2014 02:28:36 +0000 Subject: [petsc-users] implicit TS solver for flux calculation Message-ID: Dear all, I'm learning how to use implicit TS solvers to calculate the flux. The example code is http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex10.c.html I'm dying in understanding Lines 232-255. Is there anyone who knows how to work this out? Thank you in advance and sorry for bothering others. Best regards, Feng- From jed at jedbrown.org Thu Sep 18 22:02:36 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 18 Sep 2014 21:02:36 -0600 Subject: [petsc-users] implicit TS solver for flux calculation In-Reply-To: References: Message-ID: <87vboknwhf.fsf@jedbrown.org> Feng Shi writes: > Dear all, > > I'm learning how to use implicit TS solvers to calculate the flux. The example code is http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex10.c.html > I'm dying in understanding Lines 232-255. Is there anyone who knows how to work this out? It's chain rule, differentiating the fluxes with respect to the states in each cell. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From fshi at fit.edu Fri Sep 19 23:45:29 2014 From: fshi at fit.edu (Feng Shi) Date: Sat, 20 Sep 2014 04:45:29 +0000 Subject: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - Message-ID: Hi Folks, I get hard times in using the routine MatSetValuesStencil for dof>1. There numerous examples for dof=1 but dof>1, neither in the Petsc manual. The routine on-line manual (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html) says, idxm/idxn are grid coordinates (and component number when dof > 1) for matrix columns being entered. What is the general form of these indices? Am I right, for example, for idxn, if I write idxn[ix, dof].i? Thank you for your help! Best regards, Feng From jed at jedbrown.org Sat Sep 20 00:16:43 2014 From: jed at jedbrown.org (Jed Brown) Date: Fri, 19 Sep 2014 23:16:43 -0600 Subject: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - In-Reply-To: References: Message-ID: <878ulelvlw.fsf@jedbrown.org> Feng Shi writes: > Hi Folks, > > I get hard times in using the routine MatSetValuesStencil for > dof>1. There numerous examples for dof=1 but dof>1, neither in the > Petsc manual. The routine on-line manual > (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html) says, idxm/idxn are grid coordinates (and component number when dof > 1) for matrix columns being entered. What is the general form of these indices? If you have dof>1, please use MatSetValuesBlockedStencil. See src/snes/examples/tutorials/ex48.c for an example usage for a finite-element method. > Am I right, for example, for idxn, if I write idxn[ix, dof].i? That's not valid syntax, so I don't know what you're asking. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From fshi at fit.edu Sat Sep 20 00:55:41 2014 From: fshi at fit.edu (Feng Shi) Date: Sat, 20 Sep 2014 05:55:41 +0000 Subject: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - In-Reply-To: <878ulelvlw.fsf@jedbrown.org> References: , <878ulelvlw.fsf@jedbrown.org> Message-ID: I'm sorry I've no idea on the finite element, but for the finite difference, say for a 3-D case, we could use 7-point differentiation method, so the routine would be MatSetValuesStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES). I know 7 denotes the 7-points in space for differentiation and array val[0:6]. If I use MatSetValuesBlockedStencil, am I right using MatSetValuesBlockedStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES), where array is val[0:1][0:7] for dof=2? Best regards, ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Saturday, September 20, 2014 1:16 AM To: Feng Shi; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - Feng Shi writes: > Hi Folks, > > I get hard times in using the routine MatSetValuesStencil for > dof>1. There numerous examples for dof=1 but dof>1, neither in the > Petsc manual. The routine on-line manual > (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html) says, idxm/idxn are grid coordinates (and component number when dof > 1) for matrix columns being entered. What is the general form of these indices? 
If you have dof>1, please use MatSetValuesBlockedStencil. See src/snes/examples/tutorials/ex48.c for an example usage for a finite-element method. > Am I right, for example, for idxn, if I write idxn[ix, dof].i? That's not valid syntax, so I don't know what you're asking. From bsmith at mcs.anl.gov Sat Sep 20 11:12:32 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 20 Sep 2014 11:12:32 -0500 Subject: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - In-Reply-To: References: , <878ulelvlw.fsf@jedbrown.org> Message-ID: <2BA43679-CAC5-4A8E-8FE9-78CF8F44F364@mcs.anl.gov> When using MatSetValuesStencil() for dof > 1 note that typedef struct { PetscInt k,j,i,c; } MatStencil; For dof > 1 you need to assign the k,j,i AND c value. (c corresponds to the dof) for each entry you put in the matrix. For example row[0].k = 2; row[0].j=3; row[0].i= 9; row[0].c = 1 (this is for the second dof at the grid point 9,3,2) c=0 is for the first dof Only use the blocked stencil form if you really have blocks of non zeros in the matrix (i.e. all the dof are fully coupled with each other and the neighboring points). Barry On Sep 20, 2014, at 12:55 AM, Feng Shi wrote: > I'm sorry I've no idea on the finite element, but for the finite difference, say for a 3-D case, we could use 7-point differentiation method, so the routine would be MatSetValuesStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES). I know 7 denotes the 7-points in space for differentiation and array val[0:6]. If I use MatSetValuesBlockedStencil, am I right using MatSetValuesBlockedStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES), where array is val[0:1][0:7] for dof=2? > > Best regards, > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, September 20, 2014 1:16 AM > To: Feng Shi; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - > > Feng Shi writes: > >> Hi Folks, >> >> I get hard times in using the routine MatSetValuesStencil for >> dof>1. There numerous examples for dof=1 but dof>1, neither in the >> Petsc manual. The routine on-line manual >> (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html) says, idxm/idxn are grid coordinates (and component number when dof > 1) for matrix columns being entered. What is the general form of these indices? > > If you have dof>1, please use MatSetValuesBlockedStencil. See > src/snes/examples/tutorials/ex48.c for an example usage for a > finite-element method. > >> Am I right, for example, for idxn, if I write idxn[ix, dof].i? > > That's not valid syntax, so I don't know what you're asking. From fshi at fit.edu Sat Sep 20 12:04:30 2014 From: fshi at fit.edu (Feng Shi) Date: Sat, 20 Sep 2014 17:04:30 +0000 Subject: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - In-Reply-To: <2BA43679-CAC5-4A8E-8FE9-78CF8F44F364@mcs.anl.gov> References: , <878ulelvlw.fsf@jedbrown.org> , <2BA43679-CAC5-4A8E-8FE9-78CF8F44F364@mcs.anl.gov> Message-ID: Thank you very much! 
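To make the k,j,i,c layout described above concrete, here is a minimal sketch (not taken from any PETSc example) of inserting two scalar entries for one row with MatSetValuesStencil on a 3-D DMDA. It assumes B was created with DMCreateMatrix() on that DMDA, that i, j, k, c are loop indices over owned grid points and components, and that ierr is declared in the enclosing function.

    MatStencil  row, col[2];
    PetscScalar v[2];

    row.i = i;      row.j = j; row.k = k; row.c = c;           /* row: component c at node (i,j,k) */
    col[0].i = i;   col[0].j = j; col[0].k = k; col[0].c = c;  /* diagonal entry                   */
    col[1].i = i-1; col[1].j = j; col[1].k = k; col[1].c = c;  /* west neighbour, same component   */
    v[0] = 2.0; v[1] = -1.0;                                   /* illustrative values only         */
    ierr = MatSetValuesStencil(B,1,&row,2,col,v,INSERT_VALUES);CHKERRQ(ierr);

At a physical (non-periodic) boundary the neighbour index goes negative and, as noted later in the thread, such entries are simply ignored by the insertion routines.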
Best regards, Feng ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Saturday, September 20, 2014 12:12 PM To: Feng Shi Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - When using MatSetValuesStencil() for dof > 1 note that typedef struct { PetscInt k,j,i,c; } MatStencil; For dof > 1 you need to assign the k,j,i AND c value. (c corresponds to the dof) for each entry you put in the matrix. For example row[0].k = 2; row[0].j=3; row[0].i= 9; row[0].c = 1 (this is for the second dof at the grid point 9,3,2) c=0 is for the first dof Only use the blocked stencil form if you really have blocks of non zeros in the matrix (i.e. all the dof are fully coupled with each other and the neighboring points). Barry On Sep 20, 2014, at 12:55 AM, Feng Shi wrote: > I'm sorry I've no idea on the finite element, but for the finite difference, say for a 3-D case, we could use 7-point differentiation method, so the routine would be MatSetValuesStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES). I know 7 denotes the 7-points in space for differentiation and array val[0:6]. If I use MatSetValuesBlockedStencil, am I right using MatSetValuesBlockedStencil(A_mat, 1, &row, 7, col, val, INSERT_VALUES), where array is val[0:1][0:7] for dof=2? > > Best regards, > > ________________________________________ > From: Jed Brown [jed at jedbrown.org] > Sent: Saturday, September 20, 2014 1:16 AM > To: Feng Shi; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to set matrix values using "MatSetValuesStencil" when dof>1 idxn - > > Feng Shi writes: > >> Hi Folks, >> >> I get hard times in using the routine MatSetValuesStencil for >> dof>1. There numerous examples for dof=1 but dof>1, neither in the >> Petsc manual. The routine on-line manual >> (http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetValuesStencil.html) says, idxm/idxn are grid coordinates (and component number when dof > 1) for matrix columns being entered. What is the general form of these indices? > > If you have dof>1, please use MatSetValuesBlockedStencil. See > src/snes/examples/tutorials/ex48.c for an example usage for a > finite-element method. > >> Am I right, for example, for idxn, if I write idxn[ix, dof].i? > > That's not valid syntax, so I don't know what you're asking. From fshi at fit.edu Sun Sep 21 10:35:59 2014 From: fshi at fit.edu (Feng Shi) Date: Sun, 21 Sep 2014 15:35:59 +0000 Subject: [petsc-users] Mat indices Message-ID: Hi all, When setting the Mat indices, like the example: src/ts/examples/tutorials/ex10, I noticed that on line 508: col[0] = i-1; 509: col[1] = i; 510: col[2] = i+1(B,1,&i,3,col,&K[0][0],INSERT_VALUES); How about the case that when i=0, then col[0]=-1 and i=info.mx then col[2]=-1? My question is, is the Mat default indices starting from -1? Thank you in advance! Best regards, Feng From jed at jedbrown.org Sun Sep 21 10:55:02 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 21 Sep 2014 09:55:02 -0600 Subject: [petsc-users] Mat indices In-Reply-To: References: Message-ID: <87tx41ynmx.fsf@jedbrown.org> Feng Shi writes: > Hi all, > > When setting the Mat indices, like the example: src/ts/examples/tutorials/ex10, I noticed that on line > > 508: col[0] = i-1; > 509: col[1] = i; > 510: col[2] = i+1 511: MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); > > How about the case that when i=0, then col[0]=-1 and i=info.mx then col[2]=-1? 
> My question is, is the Mat default indices starting from -1? No, matrix insertion ignores negative indices. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Sun Sep 21 10:57:09 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 21 Sep 2014 10:57:09 -0500 Subject: [petsc-users] Mat indices In-Reply-To: References: Message-ID: Mat indices always start with 0 in global indexing used by MatSetValues, MatSetValuesBlocked With ?local orderings? on can chose to use whatever indexing makes sense for your local ordering, this is used with MatSetValuesLocal(), MatSetValuesBlockedLocal(). Barry On Sep 21, 2014, at 10:35 AM, Feng Shi wrote: > Hi all, > > When setting the Mat indices, like the example: src/ts/examples/tutorials/ex10, I noticed that on line > > 508: col[0] = i-1; > 509: col[1] = i; > 510: col[2] = i+1 511: MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); > > How about the case that when i=0, then col[0]=-1 and i=info.mx then col[2]=-1? > My question is, is the Mat default indices starting from -1? > > Thank you in advance! > > Best regards, > > Feng From fshi at fit.edu Sun Sep 21 10:59:50 2014 From: fshi at fit.edu (Feng Shi) Date: Sun, 21 Sep 2014 15:59:50 +0000 Subject: [petsc-users] Mat indices In-Reply-To: <87tx41ynmx.fsf@jedbrown.org> References: , <87tx41ynmx.fsf@jedbrown.org> Message-ID: <5D6491E9-BCD6-4AB9-9472-E252742D2EDE@fit.edu> Thank you all for your replies. All the best,, On Sep 21, 2014, at 11:55 AM, "Jed Brown" wrote: > Feng Shi writes: > >> Hi all, >> >> When setting the Mat indices, like the example: src/ts/examples/tutorials/ex10, I noticed that on line >> >> 508: col[0] = i-1; >> 509: col[1] = i; >> 510: col[2] = i+1> 511: MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); >> >> How about the case that when i=0, then col[0]=-1 and i=info.mx then col[2]=-1? >> My question is, is the Mat default indices starting from -1? > > No, matrix insertion ignores negative indices. From Vincent.De-Groof at uibk.ac.at Sun Sep 21 12:35:29 2014 From: Vincent.De-Groof at uibk.ac.at (De Groof, Vincent Frans Maria) Date: Sun, 21 Sep 2014 17:35:29 +0000 Subject: [petsc-users] Natural norm Message-ID: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> Hi all, the natural norm for positive definite systems in Petsc uses the preconditioner B, and is defined by r' * B * r. Am I right assuming that this way we want to obtain an estimate for r' * K^-1 * r, which is impossible since we don't have K^-1? But we do know B which is approximately K^-1. I'm sorry for this, I guess, basic question, but I just wanted to be sure. kind regards, Vincent -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Sep 21 13:16:29 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 21 Sep 2014 13:16:29 -0500 Subject: [petsc-users] Natural norm In-Reply-To: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> References: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> Message-ID: <4304962A-E9B6-4752-959A-3436993B5675@mcs.anl.gov> On Sep 21, 2014, at 12:35 PM, De Groof, Vincent Frans Maria wrote: > Hi all, > > > the natural norm for positive definite systems in Petsc uses the preconditioner B, and is defined by r' * B * r. Am I right assuming that this way we want to obtain an estimate for r' * K^-1 * r, which is impossible since we don't have K^-1? 
But we do know B which is approximately K^-1. I think so. The way I look at it is r' * B * r = e' * A * B * A * e, and if B is inv(A) then it equals e' * A * e, which is the "energy" of the error as measured by A, i.e. ||e||_A = sqrt(e' * A * e). Now, for example, if B = I then one gets e' * A * A * e, so it is no longer the "natural norm" of the error, just some other norm. > > > I'm sorry for this, I guess, basic question, but I just wanted to be sure. > > > kind regards, > Vincent From fshi at fit.edu Sun Sep 21 13:21:11 2014 From: fshi at fit.edu (Feng Shi) Date: Sun, 21 Sep 2014 18:21:11 +0000 Subject: [petsc-users] Mat indices In-Reply-To: References: , Message-ID: Hi all, For 2-D finite difference problems with dof>1, to use MatSetValuesBlocked, what should be the indices used in that routine? Am I right if I just use the indices just like dof=1, but set (5*dof^2) values at one time? Specifically, I'm trying to use implicit TS solver with dof>1, as in example src/ts/examples/tutorials/ex10. I understand for 1-D finite difference cases, we have (3 by dof^2) matrix elements, and we can use as in the example: MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); to insert/form the Jacobian. In my 2-D cases with dof=3, I use a 5-point finite difference scheme, which means I will have (5*3^2=45) Jacobian elements to be set at one time as in the example, right? If I use the statement "MatStencil row, col[5]" as indices to insert values, after we set: "row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=...", then just simply use: MatSetValuesBlocked(B,1,&row, 5, &col, &K[0][0],INSERT_VALUES); to insert these (5*3^2) values to form the Jacobian? I'm also confused by the dof defined in the Mat. Does it mean for each node, there are (dof^2) elements? Thank you in advance! Best regards, ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Sunday, September 21, 2014 11:57 AM To: Feng Shi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Mat indices Mat indices always start with 0 in the global indexing used by MatSetValues, MatSetValuesBlocked. With "local orderings" one can choose to use whatever indexing makes sense for your local ordering; this is used with MatSetValuesLocal(), MatSetValuesBlockedLocal(). Barry On Sep 21, 2014, at 10:35 AM, Feng Shi wrote: > Hi all, > > When setting the Mat indices, like the example: src/ts/examples/tutorials/ex10, I noticed that on line > > 508: col[0] = i-1; > 509: col[1] = i; > 510: col[2] = i+1; > 511: MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); > > How about the case that when i=0, then col[0]=-1 and i=info.mx then col[2]=-1? > My question is, is the Mat default indices starting from -1? > > Thank you in advance! > > Best regards, > > Feng From jed at jedbrown.org Sun Sep 21 13:34:02 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 21 Sep 2014 12:34:02 -0600 Subject: [petsc-users] Mat indices In-Reply-To: References: Message-ID: <87oau8zuud.fsf@jedbrown.org> Feng Shi writes: > Hi all, > > For 2-D finite difference problems with dof>1, to use > MatSetValuesBlocked, what should be the indices used in that routine? > Am I right if I just use the indices just like dof=1, but set > (5*dof^2) values at one time? That is the number of entries in a block row when using a 5-point stencil. The row and column indices are by block, not by scalar. > Specifically, I'm trying to use implicit TS solver with dof>1, as in example src/ts/examples/tutorials/ex10. 
I understand for 1-D finite diffrence cases, we have (3 by dof^2) matrix elements, and we can use as in the example: > MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); > to insert/form the Jacobian. In my 2-D cases with dof=3, I use 5-point finite difference regime, which means I will have (5*3^2=45) elements Jacobian to be set at one time as in the example, right? If I use the statement "Matstencil row, col[5]" as indices to insert values, after we set: "row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=...", then just simply use: > MatSetValuesBlocked(B,1,&row, 5, &col, &K[0][0],INSERT_VALUES); > to insert these (5*3^2) values to form the Jacobian? > > I'm also confused by the dof defined in the Mat. Does it mean for each node, there are (dof^2) elements? What "dof defined in the Mat"? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From fshi at fit.edu Sun Sep 21 13:45:22 2014 From: fshi at fit.edu (Feng Shi) Date: Sun, 21 Sep 2014 18:45:22 +0000 Subject: [petsc-users] Mat indices In-Reply-To: <87oau8zuud.fsf@jedbrown.org> References: , <87oau8zuud.fsf@jedbrown.org> Message-ID: Hi Jed, I mean, for the 3-D Maxwellian equations, the field E and B are both vectors, so we have 6 scalars, in which case I can define dof=6 for each node, right? For each node in this case, there are 6*6=36 elements for each block, am I right? As for setting the values, I understand I can use routine MatSetValuesBlockedStencil to set Jacobian. But I don't know what indices we should use in this case? Just as the same indices as the case with dof=1? row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=... then call: MatSetValuesBlockedStencil(B,1,&row, 7, &col, &Value[0][0],INSERT_VALUES); Am I right? Thank you in advance! Best regards, Feng ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Sunday, September 21, 2014 2:34 PM To: Feng Shi; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Mat indices Feng Shi writes: > Hi all, > > For 2-D finite difference problems with dof>1,to use > MatSetValuesBlocked, what should be the indices used in that routine? > Am I right if I just use the indices just like dof=1, but set > (5*dof^2) values at one time? That is the number of entries in a block row when using a 5-point stencil. The row and column indices are by block, not by scalar. > Specifically, I'm trying to use implicit TS solver with dof>1, as in example src/ts/examples/tutorials/ex10. I understand for 1-D finite diffrence cases, we have (3 by dof^2) matrix elements, and we can use as in the example: > MatSetValuesBlocked(B,1,&i,3,col,&K[0][0],INSERT_VALUES); > to insert/form the Jacobian. In my 2-D cases with dof=3, I use 5-point finite difference regime, which means I will have (5*3^2=45) elements Jacobian to be set at one time as in the example, right? If I use the statement "Matstencil row, col[5]" as indices to insert values, after we set: "row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=...", then just simply use: > MatSetValuesBlocked(B,1,&row, 5, &col, &K[0][0],INSERT_VALUES); > to insert these (5*3^2) values to form the Jacobian? > > I'm also confused by the dof defined in the Mat. Does it mean for each node, there are (dof^2) elements? What "dof defined in the Mat"? 
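As an illustration of the call pattern being discussed (a sketch only, not taken from ex10; it assumes a 2-D DMDA with dof = 3, a matrix obtained from DMCreateMatrix(), a 5-point stencil, placeholder numerical entries, and a made-up helper name, and it does not handle boundary rows):

#include <petscmat.h>

/* Sketch: insert one block row of a Jacobian for interior grid point (i,j)
   on a 2-D DMDA with dof = 3 (block size bs = 3), using a 5-point stencil.
   K is laid out as [bs][ncols][bs], i.e. one bs x (ncols*bs) row-oriented
   dense block row, which is the default layout MatSetValuesBlockedStencil()
   expects. */
static PetscErrorCode InsertBlockRow(Mat B, PetscInt i, PetscInt j)
{
  MatStencil     row, col[5];
  PetscScalar    K[3][5][3];            /* placeholder entries */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemzero(&row, sizeof(row));CHKERRQ(ierr);
  ierr = PetscMemzero(col, sizeof(col));CHKERRQ(ierr);
  ierr = PetscMemzero(K, sizeof(K));CHKERRQ(ierr);

  row.i = i;      row.j = j;            /* block row for node (i,j) */
  col[0].i = i;   col[0].j = j;         /* center */
  col[1].i = i-1; col[1].j = j;         /* west   */
  col[2].i = i+1; col[2].j = j;         /* east   */
  col[3].i = i;   col[3].j = j-1;       /* south  */
  col[4].i = i;   col[4].j = j+1;       /* north  */

  /* one block row, five block columns, 3x3 scalars per block = 45 values */
  ierr = MatSetValuesBlockedStencil(B, 1, &row, 5, col, &K[0][0][0], INSERT_VALUES);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The same pattern extends to the 3-D, 7-point case discussed below: col becomes col[7], the .k members of row and col are set as well, and K is sized [bs][7][bs].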
From jed at jedbrown.org Sun Sep 21 13:49:49 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 21 Sep 2014 12:49:49 -0600 Subject: [petsc-users] Mat indices In-Reply-To: References: <87oau8zuud.fsf@jedbrown.org> Message-ID: <87lhpczu42.fsf@jedbrown.org> Feng Shi writes: > Hi Jed, > > I mean, for the 3-D Maxwellian equations, the field E and B are both > vectors, so we have 6 scalars, in which case I can define dof=6 for > each node, right? If you use a collocated discretization, yes. Though you might want to use a staggered/mixed scheme for curl-compatibility. > For each node in this case, there are 6*6=36 elements for each block, > am I right? Yes, for collocated. > As for setting the values, I understand I can use routine MatSetValuesBlockedStencil to set Jacobian. But I don't know what indices we should use in this case? Just as the same indices as the case with dof=1? row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=... then call: > MatSetValuesBlockedStencil(B,1,&row, 7, &col, &Value[0][0],INSERT_VALUES); > Am I right? The "..." part above is important, but yes, that's the general idea. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From fshi at fit.edu Sun Sep 21 13:59:18 2014 From: fshi at fit.edu (Feng Shi) Date: Sun, 21 Sep 2014 18:59:18 +0000 Subject: [petsc-users] Mat indices In-Reply-To: <87lhpczu42.fsf@jedbrown.org> References: <87oau8zuud.fsf@jedbrown.org> , <87lhpczu42.fsf@jedbrown.org> Message-ID: Hi Jed, Thank you very much! I think I understand that now. It's really helpful to me and other beginners who will use Finite Difference with Petsc in solving 2-D and 3-D problems. Thank you. row.i=i, row.j=j, row.k=k and col[1:7].i=i,i+/-1, col[1:7].j=j,j+/-1, col[1:7].k=k,k+/-1. then call: MatSetValuesBlockedStencil(B,1,&row, 7, &col, &Value[0][0],INSERT_VALUES); where Value[][] should be an array with 7*36=252 elements. Best regards, Feng ________________________________________ From: Jed Brown [jed at jedbrown.org] Sent: Sunday, September 21, 2014 2:49 PM To: Feng Shi; petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Mat indices Feng Shi writes: > Hi Jed, > > I mean, for the 3-D Maxwellian equations, the field E and B are both > vectors, so we have 6 scalars, in which case I can define dof=6 for > each node, right? If you use a collocated discretization, yes. Though you might want to use a staggered/mixed scheme for curl-compatibility. > For each node in this case, there are 6*6=36 elements for each block, > am I right? Yes, for collocated. > As for setting the values, I understand I can use routine MatSetValuesBlockedStencil to set Jacobian. But I don't know what indices we should use in this case? Just as the same indices as the case with dof=1? row.i=i, row.j=j and col[1:5].i=..., col[1:5].j=... then call: > MatSetValuesBlockedStencil(B,1,&row, 7, &col, &Value[0][0],INSERT_VALUES); > Am I right? The "..." part above is important, but yes, that's the general idea. From jed at jedbrown.org Sun Sep 21 14:06:59 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 21 Sep 2014 13:06:59 -0600 Subject: [petsc-users] Mat indices In-Reply-To: References: <87oau8zuud.fsf@jedbrown.org> <87lhpczu42.fsf@jedbrown.org> Message-ID: <87iokgztbg.fsf@jedbrown.org> Feng Shi writes: > Hi Jed, > > Thank you very much! > > I think I understand that now. 
It's really helpful to me and other beginners who will use Finite Difference with Petsc in solving 2-D and 3-D problems. Thank you. > > row.i=i, row.j=j, row.k=k and col[1:7].i=i,i+/-1, col[1:7].j=j,j+/-1, > col[1:7].k=k,k+/-1. Note that it's really col[0] through col[6]. > then call: MatSetValuesBlockedStencil(B,1,&row, 7, &col, > &Value[0][0],INSERT_VALUES); where Value[][] should be an array with > 7*36=252 elements. Yes. If it makes the indexing more clear, you can use PetscScalar Value[1][bs][7][bs]; there is only one row and the columns are laid out as above (same as is most natural for a scalar problem). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From 4bikerboyjohn at gmail.com Mon Sep 22 07:36:55 2014 From: 4bikerboyjohn at gmail.com (John Alletto) Date: Mon, 22 Sep 2014 05:36:55 -0700 Subject: [petsc-users] Laplacian at infinity Message-ID: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> All, I am try to match some E&M problems with analytical solutions. How do I deal with infinity when using a uniform grid ? How far out do I need to process, do I need to use a particular technique? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Sep 22 07:51:24 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Sep 2014 07:51:24 -0500 Subject: [petsc-users] Laplacian at infinity In-Reply-To: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> References: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> Message-ID: On Mon, Sep 22, 2014 at 7:36 AM, John Alletto <4bikerboyjohn at gmail.com> wrote: > All, > > I am try to match some E&M problems with analytical solutions. > How do I deal with infinity when using a uniform grid ? > It depends on your problem. Keep making it bigger and see if you get convergence. > How far out do I need to process, do I need to use a particular technique? > It depends on what equations you are using (Maxwell, Poisson, etc.) and what is important (no reflection, etc.) Matt > John > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Sep 22 08:54:38 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 22 Sep 2014 07:54:38 -0600 Subject: [petsc-users] Natural norm In-Reply-To: <4304962A-E9B6-4752-959A-3436993B5675@mcs.anl.gov> References: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> <4304962A-E9B6-4752-959A-3436993B5675@mcs.anl.gov> Message-ID: <871tr3zroh.fsf@jedbrown.org> Barry Smith writes: > On Sep 21, 2014, at 12:35 PM, De Groof, Vincent Frans Maria wrote: > >> the natural norm for positive definite systems in Petsc uses the >> preconditioner B, and is defined by r' * B * r. Am I right assuming >> that this way we want to obtain an estimate for r' * K^-1 * r, which >> is impossible since we don't have K^-1? But we do know B which is >> approximately K^-1. > > I think so. The way I look at it is r? * B * r = e? *A *B *A e and > if B is inv(A) then it = e?*A*e which is the ?energy? of the error > as measured by A, Hmm, unpreconditioned CG minimizes the A-norm (energy norm) of the error: i.e., |e|_A = e' * A * e. 
This is in contrast to GMRES which simply minimizes the 2-norm of the residual: |r|_2 = r' * r = e' * A' * A * e = |e|_{A'*A}. Note that CG's norm is stronger. When you add preconditioning, CG minimizes the B^{T/2} A B^{1/2} norm of the error as compared to GMRES, which minimizes the B' A' A B norm (or A' B' B A for left preconditioning). If the preconditioner B = A^{-1}, then all methods minimize both the error and residual (in exact arithmetic) because the preconditioned operator is the identity. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Mon Sep 22 09:08:07 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 22 Sep 2014 08:08:07 -0600 Subject: [petsc-users] Laplacian at infinity In-Reply-To: References: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> Message-ID: <87zjdriw8o.fsf@jedbrown.org> Matthew Knepley writes: > On Mon, Sep 22, 2014 at 7:36 AM, John Alletto <4bikerboyjohn at gmail.com> > wrote: > >> All, >> >> I am try to match some E&M problems with analytical solutions. >> How do I deal with infinity when using a uniform grid ? >> > > It depends on your problem. Keep making it bigger and see if you get > convergence. This is a way to assess error caused by the finite grid, but you can also use an analytic solution for that. Using an asymptotic expansion for the boundary condition can be useful (esp. when there is nonzero charge in the domain, for example). For wave propagation, look at perfectly matched layers (PML). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Mon Sep 22 09:52:56 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Sep 2014 09:52:56 -0500 Subject: [petsc-users] Laplacian at infinity In-Reply-To: <87zjdriw8o.fsf@jedbrown.org> References: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> <87zjdriw8o.fsf@jedbrown.org> Message-ID: On Mon, Sep 22, 2014 at 9:08 AM, Jed Brown wrote: > Matthew Knepley writes: > > > On Mon, Sep 22, 2014 at 7:36 AM, John Alletto <4bikerboyjohn at gmail.com> > > wrote: > > > >> All, > >> > >> I am try to match some E&M problems with analytical solutions. > >> How do I deal with infinity when using a uniform grid ? > >> > > > > It depends on your problem. Keep making it bigger and see if you get > > convergence. > > This is a way to assess error caused by the finite grid, but you can > also use an analytic solution for that. Using an asymptotic expansion > for the boundary condition can be useful (esp. when there is nonzero > charge in the domain, for example). For wave propagation, look at > perfectly matched layers (PML). > PML is the best option, although many codes simply use attenuation. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hillsmattc at outlook.com Mon Sep 22 04:44:31 2014 From: hillsmattc at outlook.com (Matthew Hills) Date: Mon, 22 Sep 2014 11:44:31 +0200 Subject: [petsc-users] PETSc/TAU configuration In-Reply-To: References: , Message-ID: Hi PETSc Team, I'm still experiencing difficulties with configuring PETSc with TAU. I'm currently: building OpenMPI 1. 
./configure --prefix=${SESKADIR}/packages/openmpi 2. make all install set library path 1. export LD_LIBRARY_PATH=${SESKADIR}/lib:${SESKADIR}/packages/openmpi /lib:${SESKADIR}/packages/pdt/x86_64/lib:/${SESKADIR}/packages/tau/x86_64/lib:${SESKADIR}/packages/petsc/${PETSC_ARCH}/lib:$LD_LIBRARY_PATH 2. export PATH=${SESKADIR}/bin:${SESKADIR}/packages/petsc/${PETSC_ARCH}/bin:$PATH build PDT (pdtoolkit-3.20) 1. ./configure -GNU 2. export PATH=${SESKADIR}/packages/pdt/x86_64/bin:${SESKADIR}/packages/pdt/x86_64//bin:$PATH 5. make 6. make install build TAU (tau-2.23.1) using OpenMPI 1. ./configure -prefix=`pwd` -cc=mpicc -c++=mpicxx -fortran=mpif90 -pdt=${SESKADIR}/packages/pdt -mpiinc=${SESKADIR}/packages/openmpi/include -mpilib=${SESKADIR}/packages/openmpi/lib -bfd=download 2. export PATH=${SESKADIR}/packages/tau/x86_64/bin:$PATH 3. make install build fblaslapacklinpack-3.1.1 1. make build PETSc using TAU_CC/MPI 1. export TAU_MAKEFILE=${SESKADIR}/packages/tau/x86_64/lib/Makefile.tau-mpi-pdt 2. ./configure --prefix='pwd' --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --with-blas-lapack-dir=${SESKADIR}/packages/fblaslapack Error: Tried looking for file: /tmp/petsc-U9YCMv/config.setCompilers/conftest Error: Failed to link with TAU options Error: Command(Executable) is -- gcc Attached you'll find my configure log. Any assistance would be greatly appreciated. Warm regards, Matthew > Date: Tue, 16 Sep 2014 08:21:41 -0500 > From: balay at mcs.anl.gov > To: hillsmattc at outlook.com > CC: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] PETSc/TAU configuration > > I haven't tried using TAU in a while - but here are some obvious things to try. > > 1. --download-mpich [or openmpi] with TAU does not make sense. > > You would have to build MPICH/OpenMPI first. > > Then build TAU to use this MPI. > > And then build PETSc to use this TAU_CC/MPI > > 2. I would use only tau_cc.sh - and not bother with c++/fortran > > i.e [with TAU build with a given mpicc] - configure PETSc with: > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 > > 3. Do not use any --download-package when using tau_cc.sh. First check > if you are able to use TAU with PETSc - without externalpackages [you > would need blas,mpi. Use system blas/lapack for blas/lapack - and > build MPI as mentioned above for use with TAU and later PETSc] > > And if you really need these externalpackage [assuming the above basic > build with TAU works] - I would recommend the following 2 step build process: > > > 4.1. ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-PACKAGE PETSC_ARCH=arch-packages > > 4.2. Now strip out the petsc relavent stuff from this location > rm -f arch-packages/include/petsc*.h > > 4.3. Now build PETSc with TAU - using these prebuilt-packages > > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 PETSC_ARCH=arch-tau --with-PACKAGE-dir=`pwd`/arch-packages > > BTW: the current release is petsc-3.5 - we recommend upgrading to > using it [as we usually support the latest release wrt debugging/bug > fixes] > > Satish > > > > On Tue, 16 Sep 2014, Matthew Hills wrote: > > > > Hi PETSc Team,] > > > > I am experiencing difficulties with configuring PETSc with TAU. I have replaced the standard compilers with the tau_cc.sh, tau_cxx.sh, and tau_f90.sh scripts but this produces multiple errors. I have also attempted to use OpenMPI and MPICH, but both produce their own unique errors. 
> > > > After successfully compiling PDT, TAU was compiled with: > > > > ./configure > > -prefix=`pwd` > > -cc=gcc > > -c++=g++ -fortran=gfortran > > -pdt=${SESKADIR}/packages/pdt > > -mpiinc=${PETSC_DIR}/${PETSC_ARCH}/include > > -mpilib=${PETSC_DIR}/${PETSC_ARCH}/lib -bfd=download > > > > Attached you'll find the PETSc configuration logs. If any more information is needed please let me know. > > > > Warm regards, > > > > Matthew Hills > > University of Cape Town > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure_petsc-tau.log URL: From balay at mcs.anl.gov Mon Sep 22 10:24:15 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 22 Sep 2014 10:24:15 -0500 Subject: [petsc-users] PETSc/TAU configuration In-Reply-To: References: , Message-ID: On Mon, 22 Sep 2014, Matthew Hills wrote: > > > > Hi PETSc Team, > > I'm still experiencing difficulties with configuring PETSc with TAU. I'm currently: > > building OpenMPI > 1. ./configure --prefix=${SESKADIR}/packages/openmpi > 2. make all install > > set library path > 1. export LD_LIBRARY_PATH=${SESKADIR}/lib:${SESKADIR}/packages/openmpi /lib:${SESKADIR}/packages/pdt/x86_64/lib:/${SESKADIR}/packages/tau/x86_64/lib:${SESKADIR}/packages/petsc/${PETSC_ARCH}/lib:$LD_LIBRARY_PATH > 2. export PATH=${SESKADIR}/bin:${SESKADIR}/packages/petsc/${PETSC_ARCH}/bin:$PATH > > > > build PDT (pdtoolkit-3.20) > 1. ./configure -GNU > 2. export PATH=${SESKADIR}/packages/pdt/x86_64/bin:${SESKADIR}/packages/pdt/x86_64//bin:$PATH > 5. make > 6. make install > > build TAU (tau-2.23.1) using OpenMPI > 1. ./configure -prefix=`pwd` -cc=mpicc -c++=mpicxx -fortran=mpif90 -pdt=${SESKADIR}/packages/pdt -mpiinc=${SESKADIR}/packages/openmpi/include -mpilib=${SESKADIR}/packages/openmpi/lib -bfd=download > 2. export PATH=${SESKADIR}/packages/tau/x86_64/bin:$PATH > 3. make install > > build fblaslapacklinpack-3.1.1 > 1. make Should have said '--download-fblaslapack' would be fine here [as it uses mpif90 - not tau_cc.sh]. Building seperately is also fine. > > build PETSc using TAU_CC/MPI > 1. export TAU_MAKEFILE=${SESKADIR}/packages/tau/x86_64/lib/Makefile.tau-mpi-pdt > 2. ./configure --prefix='pwd' --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --with-blas-lapack-dir=${SESKADIR}/packages/fblaslapack --prefix='pwd' doesn't make sense. Please remove it. > > Error: Tried looking for file: /tmp/petsc-U9YCMv/config.setCompilers/conftest > Error: Failed to link with TAU options > Error: Command(Executable) is -- gcc configure.log looks complete [and indicates a successful run]. Did the messages above come during configure step on the terminal? Can you try the following and see if PETSc builds successfully? [but recommend rerunning configure first - without --prefix option] make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all Satish > > > Attached you'll find my configure log. Any assistance would be greatly appreciated. > > Warm regards, > Matthew > > > > > Date: Tue, 16 Sep 2014 08:21:41 -0500 > > From: balay at mcs.anl.gov > > To: hillsmattc at outlook.com > > CC: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] PETSc/TAU configuration > > > > I haven't tried using TAU in a while - but here are some obvious things to try. > > > > 1. --download-mpich [or openmpi] with TAU does not make sense. > > > > You would have to build MPICH/OpenMPI first. > > > > Then build TAU to use this MPI. 
> > > > And then build PETSc to use this TAU_CC/MPI > > > > 2. I would use only tau_cc.sh - and not bother with c++/fortran > > > > i.e [with TAU build with a given mpicc] - configure PETSc with: > > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 > > > > 3. Do not use any --download-package when using tau_cc.sh. First check > > if you are able to use TAU with PETSc - without externalpackages [you > > would need blas,mpi. Use system blas/lapack for blas/lapack - and > > build MPI as mentioned above for use with TAU and later PETSc] > > > > And if you really need these externalpackage [assuming the above basic > > build with TAU works] - I would recommend the following 2 step build process: > > > > > > 4.1. ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-PACKAGE PETSC_ARCH=arch-packages > > > > 4.2. Now strip out the petsc relavent stuff from this location > > rm -f arch-packages/include/petsc*.h > > > > 4.3. Now build PETSc with TAU - using these prebuilt-packages > > > > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 PETSC_ARCH=arch-tau --with-PACKAGE-dir=`pwd`/arch-packages > > > > BTW: the current release is petsc-3.5 - we recommend upgrading to > > using it [as we usually support the latest release wrt debugging/bug > > fixes] > > > > Satish > > > > > > > > On Tue, 16 Sep 2014, Matthew Hills wrote: > > > > > > Hi PETSc Team,] > > > > > > I am experiencing difficulties with configuring PETSc with TAU. I have replaced the standard compilers with the tau_cc.sh, tau_cxx.sh, and tau_f90.sh scripts but this produces multiple errors. I have also attempted to use OpenMPI and MPICH, but both produce their own unique errors. > > > > > > After successfully compiling PDT, TAU was compiled with: > > > > > > ./configure > > > -prefix=`pwd` > > > -cc=gcc > > > -c++=g++ -fortran=gfortran > > > -pdt=${SESKADIR}/packages/pdt > > > -mpiinc=${PETSC_DIR}/${PETSC_ARCH}/include > > > -mpilib=${PETSC_DIR}/${PETSC_ARCH}/lib -bfd=download > > > > > > Attached you'll find the PETSc configuration logs. If any more information is needed please let me know. > > > > > > Warm regards, > > > > > > Matthew Hills > > > University of Cape Town > > > > From stefano.zampini at gmail.com Mon Sep 22 10:39:44 2014 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 22 Sep 2014 18:39:44 +0300 Subject: [petsc-users] FETI-DP implementation and call sequence In-Reply-To: References: <871trs7pnj.fsf@jedbrown.org> <1F4A79CC-DDB5-40C1-A94C-870CCF03330F@gmail.com> <87egvpy3gt.fsf@jedbrown.org> <87lhpxwhda.fsf@jedbrown.org> Message-ID: Sorry for late reply. I just pushed a fix for the crash. It is in master. Stefano On Fri, 5 Sep 2014, Jed Brown wrote: > Satish Balay writes: > > Perhaps the following is the fix [with proper comments, more error > > checks?]. But someone more familiar with this code should check this.. 
> > > > Satish > > > > -------------- > > $ git diff |cat > > diff --git a/src/ksp/pc/impls/is/pcis.c b/src/ksp/pc/impls/is/pcis.c > > index dab5836..0fa0217 100644 > > --- a/src/ksp/pc/impls/is/pcis.c > > +++ b/src/ksp/pc/impls/is/pcis.c > > @@ -140,6 +140,8 @@ PetscErrorCode PCISSetUp(PC pc) > > ierr = PetscObjectTypeCompare((PetscObject)pc->pmat,MATIS,&flg);CHKERRQ(ierr); > > if (!flg) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONG,"Preconditioner type of Neumann Neumman requires matrix of type MATIS"); > > matis = (Mat_IS*)pc->pmat->data; > > + PetscObjectReference((PetscObject)pc->pmat); > > + pcis->pmat = pc->pmat; > > Uh, PCISSetUp can be called more than once? I have no idea.. > > And simply destroying the pcis->pmat reference is not enough because > that extra reference could significantly increase the peak memory usage. Curently the object (pc->pmat) is destroyed at the end anyway [perhaps duing PCDestroy()]. This fix changes the order a bit so that its destoryed only after its last use. > The right solution is to not hold that reference and not hold the info. > > > pcis->pure_neumann = matis->pure_neumann; > > > > @@ -378,8 +380,9 @@ PetscErrorCode PCISDestroy(PC pc) > > ierr = VecScatterDestroy(&pcis->global_to_B);CHKERRQ(ierr); > > ierr = PetscFree(pcis->work_N);CHKERRQ(ierr); > > if (pcis->ISLocalToGlobalMappingGetInfoWasCalled) { > > - ierr = ISLocalToGlobalMappingRestoreInfo((ISLocalToGlobalMapping)0,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > > + ierr = ISLocalToGlobalMappingRestoreInfo(((Mat_IS*)pcis->pmat->data)->mapping,&(pcis->n_neigh),&(pcis->neigh),&(pcis->n_shared),&(pcis->shared));CHKERRQ(ierr); > > } > > Why not restore the info at the place it is gotten, like we do with > every other accessor? Looks like this info is stashed in 'pcis->n_neigh, pcis->neigh' etc - and reused later multple times. [perhaps preventing multiple mallocs/frees] $ git grep -l 'pcis->n_neigh' src/ksp/pc/impls/bddc/bddcfetidp.c src/ksp/pc/impls/is/nn/nn.c src/ksp/pc/impls/is/pcis.c Or perhaps this info should be stashed in the IS so multiple ISLocalToGlobalMappingGetInfo() calls are cheap [but then the malloc'd memory will live until IS is destroyed anyway] I guess there are 2 issues you are touching on. A fix for this crash - and code cleanup. My patch gets the examples working. But I'll defer both isses to Stefano [asuming he is aquainted with the above sources]. Satish > > > + ierr = MatDestroy(&pcis->pmat);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetUseStiffnessScaling_C",NULL);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainScalingFactor_C",NULL);CHKERRQ(ierr); > > ierr = PetscObjectComposeFunction((PetscObject)pc,"PCISSetSubdomainDiagonalScaling_C",NULL);CHKERRQ(ierr); > > diff --git a/src/ksp/pc/impls/is/pcis.h b/src/ksp/pc/impls/is/pcis.h > > index 4a42cf9..736ea8c 100644 > > --- a/src/ksp/pc/impls/is/pcis.h > > +++ b/src/ksp/pc/impls/is/pcis.h > > @@ -73,6 +73,7 @@ typedef struct { > > /* We need: */ > > /* proc[k].loc_to_glob(proc[k].shared[i][m]) == proc[l].loc_to_glob(proc[l].shared[j][m]) */ > > /* for all 0 <= m < proc[k].n_shared[i], or equiv'ly, for all 0 <= m < proc[l].n_shared[j] */ > > + Mat pmat; > > } PC_IS; > > > > PETSC_EXTERN PetscErrorCode PCISSetUp(PC pc); > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Mon Sep 22 11:37:30 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Sep 2014 11:37:30 -0500 Subject: [petsc-users] Natural norm In-Reply-To: <871tr3zroh.fsf@jedbrown.org> References: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> <4304962A-E9B6-4752-959A-3436993B5675@mcs.anl.gov> <871tr3zroh.fsf@jedbrown.org> Message-ID: On Sep 22, 2014, at 8:54 AM, Jed Brown wrote: > Barry Smith writes: > >> On Sep 21, 2014, at 12:35 PM, De Groof, Vincent Frans Maria wrote: >> >>> the natural norm for positive definite systems in Petsc uses the >>> preconditioner B, and is defined by r' * B * r. Am I right assuming >>> that this way we want to obtain an estimate for r' * K^-1 * r, which >>> is impossible since we don't have K^-1? But we do know B which is >>> approximately K^-1. >> >> I think so. The way I look at it is r? * B * r = e? *A *B *A e and >> if B is inv(A) then it = e?*A*e which is the ?energy? of the error >> as measured by A, > All true. Switching the CG ?norm? does not effect the algorithm (in exact arithmetic) it only affects the norm that is printed and when the algorithm stops. The same minimization principles hold independent of the ?norm? used. Bary > Hmm, unpreconditioned CG minimizes the A-norm (energy norm) of the error: > i.e., |e|_A = e' * A * e. This is in contrast to GMRES which simply > minimizes the 2-norm of the residual: |r|_2 = r' * r = e' * A' * A * e = > |e|_{A'*A}. Note that CG's norm is stronger. > > When you add preconditioning, CG minimizes the B^{T/2} A B^{1/2} norm of > the error as compared to GMRES, which minimizes the B' A' A B norm (or > A' B' B A for left preconditioning). > > If the preconditioner B = A^{-1}, then all methods minimize both the > error and residual (in exact arithmetic) because the preconditioned > operator is the identity. From jed at jedbrown.org Mon Sep 22 11:42:07 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 22 Sep 2014 10:42:07 -0600 Subject: [petsc-users] Natural norm In-Reply-To: References: <17A78B9D13564547AC894B88C1596747203AF1F7@XMBX4.uibk.ac.at> <4304962A-E9B6-4752-959A-3436993B5675@mcs.anl.gov> <871tr3zroh.fsf@jedbrown.org> Message-ID: <87lhpbip40.fsf@jedbrown.org> Barry Smith writes: > All true. Switching the CG ?norm? does not effect the algorithm (in > exact arithmetic) it only affects the norm that is printed and when > the algorithm stops. The same minimization principles hold > independent of the ?norm? used. Yup, and the natural norm avoids the need for an extra reduction. If you're using classic (non-pipelined and not single_reduction) CG in a latency-sensitive environment, then you might benefit from using the natural norm. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From antoine.deblois at aero.bombardier.com Mon Sep 22 12:11:59 2014 From: antoine.deblois at aero.bombardier.com (Antoine De Blois) Date: Mon, 22 Sep 2014 17:11:59 +0000 Subject: [petsc-users] superlu_dist and MatSolveTranspose In-Reply-To: References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED55AED@MTLWAEXCH005.ca.aero.bombardier.net> <87vbpc49e3.fsf@jedbrown.org> <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED5BEDF@MTLWAEXCH005.ca.aero.bombardier.net> Message-ID: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BB9A@MTLWAEXCH004.ca.aero.bombardier.net> Dear all, Sorry for the delay on this topic. Thank you Gaetan for your suggestion. 
I had thought about doing that originally, but I had left it out since I thought that a rank owned the entire row of the matrix (and not only the sub-diagonal part). I will certainly give it a try. I still need the MatSolveTranspose since I need the ability to reuse the residual jacobian matrix from the flow (a 1st order approximation of it), which is assembled in a non-transposed format. This way the adjoint system is solved in a pseudo-time step manner, where the product of the exact jacobian matrix and the adjoint vector is used as a source term in the rhs. Hong, do you have an estimation of the time required to implement it in superlu_dist? Best, Antoine -----Message d'origine----- De?: Hong [mailto:hzhang at mcs.anl.gov] Envoy??: Friday, August 29, 2014 9:14 PM ??: Gaetan Kenway Cc?: Antoine De Blois; petsc-users at mcs.anl.gov Objet?: Re: [petsc-users] superlu_dist and MatSolveTranspose We can add MatSolveTranspose() to the petsc interface with superlu_dist. Jed, Are you working on it? If not, I can work on it. Hong On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway wrote: > Hi Antoine > > We are also using PETSc for solving adjoint systems resulting from > CFD. To get around the matSolveTranspose issue we just assemble the > transpose matrix directly and then call KSPSolve(). If this is > possible in your application I think it is probably the best approach > > Gaetan > > > On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois > wrote: >> >> Hello Jed, >> >> Thank you for your quick response. So I spent some time to dig deeper >> into my problem. I coded a shell script that passes through a bunch >> of ksp_type, pc_type and sub_pc_type. So please disregard the comment >> about the "does not converge properly for transpose". I had taken >> that conclusion from my own code (and not from the ex10 and extracted >> matrix), and a KSPSetFromOptions was missing. Apologies for that. >> >> What remains is the performance issue. The MatSolveTranspose takes a >> very long time to converge. For a matrix of 3 million rows, >> MatSolveTranspose takes roughly 5 minutes on 64 cpus, whereas the >> MatSolve is almost instantaneous!. When I gdb my code, petsc seems to >> be stalled in the MatLUFactorNumeric_SeqAIJ_Inode () for a long time. >> I also did a top on the compute node to check the RAM usage. It was >> hovering over 2 gig, so memory usage does not seem to be an issue here. 
>> >> #0 0x00002afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #1 0x00002afe8e07f15c in MatLUFactorNumeric () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #2 0x00002afe8e2afa99 in PCSetUp_ILU () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #3 0x00002afe8e337c0d in PCSetUp () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #4 0x00002afe8e39d643 in KSPSetUp () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #5 0x00002afe8e39e3ee in KSPSolveTranspose () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #6 0x00002afe8e300f8c in PCApplyTranspose_ASM () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #7 0x00002afe8e338c13 in PCApplyTranspose () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #8 0x00002afe8e3a8a84 in KSPInitialResidual () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #9 0x00002afe8e376c32 in KSPSolve_GMRES () >> from >> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >> bpetsc.so.3.5 >> #10 0x00002afe8e39e425 in KSPSolveTranspose () >> >> For that particular application, I was using: >> ksp_type: gmres >> pc_type: asm >> sub_pc_type: ilu >> adj_sub_pc_factor_levels 1 >> >> For small matrices, the MatSolveTranspose computing time is very >> similar to the simple MatSolve. >> >> And if I want to revert to a MatTranspose followed by the MatSolve, >> then the MatTranspose takes forever to finish... For a matrix of 3 >> million rows, MatTranspose takes 30 minutes on 64 cpus!! >> >> So thank you for implementing the transpose solve in superlu_dist. It >> would also be nice to have it with hypre. >> Let me know what you think and ideas on how to improve my >> computational time, Regards, Antoine >> >> -----Message d'origine----- >> De : Jed Brown [mailto:jed at jedbrown.org] Envoy? : Thursday, August >> 28, 2014 5:01 PM ? : Antoine De Blois; 'petsc-users at mcs.anl.gov' >> Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose >> >> Antoine De Blois writes: >> >> > Hello everyone, >> > >> > I am trying to solve a A^T x = b system. For my applications, I had >> > realized that the MatSolveTranspose does not converge properly. >> >> What do you mean "does not converge properly"? Can you send a test >> case where the transpose solve should be equivalent, but is not? We >> have only a few tests for transpose solve and not all preconditioners >> support it, but where it is supported, we want to ensure that it is correct. >> >> > Therefore, I had implemented a MatTranspose followed by a MatSolve. >> > This proved to converge perfectly (which is strange since the >> > transposed matrix has the same eigenvalues as the untransposed...). >> > The problem is that for bigger matrices, the MatTranspose is very >> > costly and thus cannot be used. >> >> Costly in terms of memory? (I want you to be able to use >> KSPSolveTranspose, but I'm curious what you're experiencing.) >> >> > I tried using the superlu_dist package. 
Although it the package >> > works perfectly for the MatSolve, I get the an "No support for this >> > operation for this object type" error with MatSolveTransopse. I >> > reproduced the error using the MatView an ex10 tutorial. I can >> > provide the matrix and rhs upon request. My command line was: >> > >> > ex10 -f0 A_and_rhs.bin -pc_type lu -pc_factor_mat_solver_package >> > superlu_dist -trans >> > >> > So it there an additional parameter I need to use for the >> > transposed solve? >> > >> > [0]PETSC ERROR: --------------------- Error Message >> > -------------------------------------------------------------- >> > [0]PETSC ERROR: No support for this operation for this object type >> > [0]PETSC ERROR: Matrix type mpiaij >> >> This is easy to add. I'll do it now. >> >> > [0]PETSC ERROR: See >> > http://www.mcs.anl.gov/petsc/documentation/faq.html >> > for trouble shooting. >> > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown [0]PETSC ERROR: >> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >> > s/t >> > utorials/ex10 on a ARGUS_impi_opt named hpc-user11 by ad007804 Thu >> > Aug >> > 28 16:41:15 2014 [0]PETSC ERROR: Configure options --CFLAGS="-xHost >> > -axAVX" --download-hypre --download-metis --download-ml >> > --download-parmetis --download-scalapack --download-superlu_dist >> > --download-mumps --with-c2html=0 --with-cc=mpiicc >> > --with-fc=mpiifort --with-cxx=mpiicpc --with-debugging=yes >> > --prefix=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/petsc-3.5. >> > 1 >> > --with-cmake=/gpfs/fs1/aero/SOFTWARE/TOOLS/CMAKE/cmake-2.8.7/bin/cm >> > ake >> > --with-valgrind=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/valgrind- >> > 3.9 .0/bin/valgrind --with-shared-libraries=0 [0]PETSC ERROR: #1 >> > MatSolveTranspose() line 3473 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/mat/interface/m >> > atr ix.c [0]PETSC ERROR: #2 PCApplyTranspose_LU() line 214 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/impls/fa >> > cto r/lu/lu.c [0]PETSC ERROR: #3 PCApplyTranspose() line 573 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/interfac >> > e/p recon.c [0]PETSC ERROR: #4 KSP_PCApply() line 233 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/include/petsc-priva >> > te/ kspimpl.h [0]PETSC ERROR: #5 KSPInitialResidual() line 63 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >> > ce/ itres.c [0]PETSC ERROR: #6 KSPSolve_GMRES() line 234 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/impls/g >> > mre s/gmres.c [0]PETSC ERROR: #7 KSPSolveTranspose() line 704 in >> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >> > ce/ itfunc.c [0]PETSC ERROR: #8 main() line 324 in >> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >> > s/t >> > utorials/ex10.c >> > >> > FYI, the transpose solve is a typical application for adjoint >> > optimization. There should be a big adjoint community of developers >> > that try to solve the transposed matrix. 
>> > >> > Any help is much appreciated, >> > Best, >> > Antoine >> > >> > >> > Antoine DeBlois >> > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead >> > A?ronautique / Aerospace 514-855-5001, x 50862 >> > antoine.deblois at aero.bombardier.com> > bar >> > dier.com> >> > >> > 2351 Blvd Alfred-Nobel >> > Montreal, Qc >> > H4S 1A9 >> > >> > [Description : Description : >> > http://signatures.ca.aero.bombardier.net/eom_logo_164x39_fr.jpg] >> > CONFIDENTIALITY NOTICE - This communication may contain privileged >> > or confidential information. >> > If you are not the intended recipient or received this >> > communication by error, please notify the sender and delete the >> > message without copying > > From hzhang at mcs.anl.gov Mon Sep 22 12:47:46 2014 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 22 Sep 2014 12:47:46 -0500 Subject: [petsc-users] superlu_dist and MatSolveTranspose In-Reply-To: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BB9A@MTLWAEXCH004.ca.aero.bombardier.net> References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED55AED@MTLWAEXCH005.ca.aero.bombardier.net> <87vbpc49e3.fsf@jedbrown.org> <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED5BEDF@MTLWAEXCH005.ca.aero.bombardier.net> <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BB9A@MTLWAEXCH004.ca.aero.bombardier.net> Message-ID: I'll add it. It would not take too long, just matter of priority. I'll try to get it done in a day or two, then let you know when it works. Hong On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois wrote: > Dear all, > > Sorry for the delay on this topic. > > Thank you Gaetan for your suggestion. I had thought about doing that originally, but I had left it out since I thought that a rank owned the entire row of the matrix (and not only the sub-diagonal part). I will certainly give it a try. > > I still need the MatSolveTranspose since I need the ability to reuse the residual jacobian matrix from the flow (a 1st order approximation of it), which is assembled in a non-transposed format. This way the adjoint system is solved in a pseudo-time step manner, where the product of the exact jacobian matrix and the adjoint vector is used as a source term in the rhs. > > Hong, do you have an estimation of the time required to implement it in superlu_dist? > > Best, > Antoine > > -----Message d'origine----- > De : Hong [mailto:hzhang at mcs.anl.gov] > Envoy? : Friday, August 29, 2014 9:14 PM > ? : Gaetan Kenway > Cc : Antoine De Blois; petsc-users at mcs.anl.gov > Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose > > We can add MatSolveTranspose() to the petsc interface with superlu_dist. > > Jed, > Are you working on it? If not, I can work on it. > > Hong > > On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway wrote: >> Hi Antoine >> >> We are also using PETSc for solving adjoint systems resulting from >> CFD. To get around the matSolveTranspose issue we just assemble the >> transpose matrix directly and then call KSPSolve(). If this is >> possible in your application I think it is probably the best approach >> >> Gaetan >> >> >> On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois >> wrote: >>> >>> Hello Jed, >>> >>> Thank you for your quick response. So I spent some time to dig deeper >>> into my problem. I coded a shell script that passes through a bunch >>> of ksp_type, pc_type and sub_pc_type. So please disregard the comment >>> about the "does not converge properly for transpose". I had taken >>> that conclusion from my own code (and not from the ex10 and extracted >>> matrix), and a KSPSetFromOptions was missing. 
Apologies for that. >>> >>> What remains is the performance issue. The MatSolveTranspose takes a >>> very long time to converge. For a matrix of 3 million rows, >>> MatSolveTranspose takes roughly 5 minutes on 64 cpus, whereas the >>> MatSolve is almost instantaneous!. When I gdb my code, petsc seems to >>> be stalled in the MatLUFactorNumeric_SeqAIJ_Inode () for a long time. >>> I also did a top on the compute node to check the RAM usage. It was >>> hovering over 2 gig, so memory usage does not seem to be an issue here. >>> >>> #0 0x00002afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #1 0x00002afe8e07f15c in MatLUFactorNumeric () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #2 0x00002afe8e2afa99 in PCSetUp_ILU () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #3 0x00002afe8e337c0d in PCSetUp () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #4 0x00002afe8e39d643 in KSPSetUp () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #5 0x00002afe8e39e3ee in KSPSolveTranspose () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #6 0x00002afe8e300f8c in PCApplyTranspose_ASM () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #7 0x00002afe8e338c13 in PCApplyTranspose () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #8 0x00002afe8e3a8a84 in KSPInitialResidual () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #9 0x00002afe8e376c32 in KSPSolve_GMRES () >>> from >>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>> bpetsc.so.3.5 >>> #10 0x00002afe8e39e425 in KSPSolveTranspose () >>> >>> For that particular application, I was using: >>> ksp_type: gmres >>> pc_type: asm >>> sub_pc_type: ilu >>> adj_sub_pc_factor_levels 1 >>> >>> For small matrices, the MatSolveTranspose computing time is very >>> similar to the simple MatSolve. >>> >>> And if I want to revert to a MatTranspose followed by the MatSolve, >>> then the MatTranspose takes forever to finish... For a matrix of 3 >>> million rows, MatTranspose takes 30 minutes on 64 cpus!! >>> >>> So thank you for implementing the transpose solve in superlu_dist. It >>> would also be nice to have it with hypre. >>> Let me know what you think and ideas on how to improve my >>> computational time, Regards, Antoine >>> >>> -----Message d'origine----- >>> De : Jed Brown [mailto:jed at jedbrown.org] Envoy? : Thursday, August >>> 28, 2014 5:01 PM ? : Antoine De Blois; 'petsc-users at mcs.anl.gov' >>> Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose >>> >>> Antoine De Blois writes: >>> >>> > Hello everyone, >>> > >>> > I am trying to solve a A^T x = b system. For my applications, I had >>> > realized that the MatSolveTranspose does not converge properly. >>> >>> What do you mean "does not converge properly"? Can you send a test >>> case where the transpose solve should be equivalent, but is not? We >>> have only a few tests for transpose solve and not all preconditioners >>> support it, but where it is supported, we want to ensure that it is correct. 
>>> >>> > Therefore, I had implemented a MatTranspose followed by a MatSolve. >>> > This proved to converge perfectly (which is strange since the >>> > transposed matrix has the same eigenvalues as the untransposed...). >>> > The problem is that for bigger matrices, the MatTranspose is very >>> > costly and thus cannot be used. >>> >>> Costly in terms of memory? (I want you to be able to use >>> KSPSolveTranspose, but I'm curious what you're experiencing.) >>> >>> > I tried using the superlu_dist package. Although it the package >>> > works perfectly for the MatSolve, I get the an "No support for this >>> > operation for this object type" error with MatSolveTransopse. I >>> > reproduced the error using the MatView an ex10 tutorial. I can >>> > provide the matrix and rhs upon request. My command line was: >>> > >>> > ex10 -f0 A_and_rhs.bin -pc_type lu -pc_factor_mat_solver_package >>> > superlu_dist -trans >>> > >>> > So it there an additional parameter I need to use for the >>> > transposed solve? >>> > >>> > [0]PETSC ERROR: --------------------- Error Message >>> > -------------------------------------------------------------- >>> > [0]PETSC ERROR: No support for this operation for this object type >>> > [0]PETSC ERROR: Matrix type mpiaij >>> >>> This is easy to add. I'll do it now. >>> >>> > [0]PETSC ERROR: See >>> > http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > for trouble shooting. >>> > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown [0]PETSC ERROR: >>> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >>> > s/t >>> > utorials/ex10 on a ARGUS_impi_opt named hpc-user11 by ad007804 Thu >>> > Aug >>> > 28 16:41:15 2014 [0]PETSC ERROR: Configure options --CFLAGS="-xHost >>> > -axAVX" --download-hypre --download-metis --download-ml >>> > --download-parmetis --download-scalapack --download-superlu_dist >>> > --download-mumps --with-c2html=0 --with-cc=mpiicc >>> > --with-fc=mpiifort --with-cxx=mpiicpc --with-debugging=yes >>> > --prefix=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/petsc-3.5. >>> > 1 >>> > --with-cmake=/gpfs/fs1/aero/SOFTWARE/TOOLS/CMAKE/cmake-2.8.7/bin/cm >>> > ake >>> > --with-valgrind=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/valgrind- >>> > 3.9 .0/bin/valgrind --with-shared-libraries=0 [0]PETSC ERROR: #1 >>> > MatSolveTranspose() line 3473 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/mat/interface/m >>> > atr ix.c [0]PETSC ERROR: #2 PCApplyTranspose_LU() line 214 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/impls/fa >>> > cto r/lu/lu.c [0]PETSC ERROR: #3 PCApplyTranspose() line 573 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/interfac >>> > e/p recon.c [0]PETSC ERROR: #4 KSP_PCApply() line 233 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/include/petsc-priva >>> > te/ kspimpl.h [0]PETSC ERROR: #5 KSPInitialResidual() line 63 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >>> > ce/ itres.c [0]PETSC ERROR: #6 KSPSolve_GMRES() line 234 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/impls/g >>> > mre s/gmres.c [0]PETSC ERROR: #7 KSPSolveTranspose() line 704 in >>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >>> > ce/ itfunc.c [0]PETSC ERROR: #8 main() line 324 in >>> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >>> > s/t >>> > utorials/ex10.c >>> > >>> > FYI, the transpose solve is a typical application for adjoint >>> > optimization. 
There should be a big adjoint community of developers >>> > that try to solve the transposed matrix. >>> > >>> > Any help is much appreciated, >>> > Best, >>> > Antoine >>> > >>> > >>> > Antoine DeBlois >>> > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead >>> > A?ronautique / Aerospace 514-855-5001, x 50862 >>> > antoine.deblois at aero.bombardier.com>> > bar >>> > dier.com> >>> > >>> > 2351 Blvd Alfred-Nobel >>> > Montreal, Qc >>> > H4S 1A9 >>> > >>> > [Description : Description : >>> > http://signatures.ca.aero.bombardier.net/eom_logo_164x39_fr.jpg] >>> > CONFIDENTIALITY NOTICE - This communication may contain privileged >>> > or confidential information. >>> > If you are not the intended recipient or received this >>> > communication by error, please notify the sender and delete the >>> > message without copying >> >> From antoine.deblois at aero.bombardier.com Mon Sep 22 13:28:29 2014 From: antoine.deblois at aero.bombardier.com (Antoine De Blois) Date: Mon, 22 Sep 2014 18:28:29 +0000 Subject: [petsc-users] PCASM in transposed format Message-ID: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BBE1@MTLWAEXCH004.ca.aero.bombardier.net> Dear all, I am using the ASM preconditioner to solve a transposed system through MatSolveTranspose. Strangely, the results I obtain differ in each call. Is there a non-deterministic operations within ASM? If I use it in a non-transposed way, I get correct results... If I use GASM, then the results are always the same, both in transposed and non-transposed formats. Below is a log of my calls (note that some of them actually diverged). I can give you my matrix and rhs if you want. Regards, Antoine $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 690 Residual norm 0.000617544 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 475 Residual norm 0.000727253 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 10000 Residual norm 1.3866 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 568 Residual norm 0.000684401 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 540 Residual norm 0.000555548 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 10000 Residual norm 1.30198 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm Number of iterations = 207 Residual norm 0.000555849 --------------- $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm Number of iterations = 297 Residual norm 0.000600143 $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm Number of iterations = 297 Residual norm 0.000600143 Antoine DeBlois Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead A?ronautique / Aerospace 514-855-5001, x 50862 antoine.deblois at 
aero.bombardier.com 2351 Blvd Alfred-Nobel Montreal, Qc H4S 1A9 [Description?: Description?: http://signatures.ca.aero.bombardier.net/eom_logo_164x39_fr.jpg] CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.jpg Type: image/jpeg Size: 4648 bytes Desc: image001.jpg URL: From chung1shen at yahoo.com Mon Sep 22 13:57:07 2014 From: chung1shen at yahoo.com (Chung Shen) Date: Mon, 22 Sep 2014 11:57:07 -0700 Subject: [petsc-users] GPU speedup in Poisson solvers Message-ID: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> Dear PETSc Users, I am new to PETSc and trying to determine if GPU speedup is possible with the 3D Poisson solvers. I configured 2 copies of 'petsc-master' on a standalone machine, one with CUDA toolkit 5.0 and one without (both without MPI): Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, Driver: 313.09) I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting about 20% speedup with GPU. Is this reasonable or did I miss something? Attached is a comparison chart with two sample logs. The y-axis is the elapsed time in seconds and the x-axis corresponds to the size of the problem. In particular, I wonder if the numbers of calls to 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are excessive? Thanks in advance for your reply. Best Regards, Chung Shen -------------- next part -------------- A non-text attachment was scrubbed... Name: chart.jpg Type: image/pjpeg Size: 122316 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ex32-m150-cpu.log URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ex32-m150-gpu.log URL: From bsmith at mcs.anl.gov Mon Sep 22 14:17:07 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Sep 2014 14:17:07 -0500 Subject: [petsc-users] PCASM in transposed format In-Reply-To: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BBE1@MTLWAEXCH004.ca.aero.bombardier.net> References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BBE1@MTLWAEXCH004.ca.aero.bombardier.net> Message-ID: <05689B26-BA45-4BFA-AE29-826BC92ACCD7@mcs.anl.gov> Please email the data file to petsc-maint at mcs.anl.gov or if you fear it is too large tell us from where we may download it. Barry On Sep 22, 2014, at 1:28 PM, Antoine De Blois wrote: > Dear all, > > I am using the ASM preconditioner to solve a transposed system through MatSolveTranspose. Strangely, the results I obtain differ in each call. Is there a non-deterministic operations within ASM? If I use it in a non-transposed way, I get correct results? If I use GASM, then the results are always the same, both in transposed and non-transposed formats. > > Below is a log of my calls (note that some of them actually diverged). > I can give you my matrix and rhs if you want. 
> > Regards, > Antoine > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 690 > Residual norm 0.000617544 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 475 > Residual norm 0.000727253 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 10000 > Residual norm 1.3866 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 568 > Residual norm 0.000684401 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 540 > Residual norm 0.000555548 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 10000 > Residual norm 1.30198 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 207 > Residual norm 0.000555849 > > > --------------- > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm > Number of iterations = 297 > Residual norm 0.000600143 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm > Number of iterations = 297 > Residual norm 0.000600143 > > > > Antoine DeBlois > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead > A?ronautique / Aerospace > 514-855-5001, x 50862 > antoine.deblois at aero.bombardier.com > > 2351 Blvd Alfred-Nobel > Montreal, Qc > H4S 1A9 > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication by error, please notify the sender > and delete the message without copying From rupp at iue.tuwien.ac.at Mon Sep 22 14:25:15 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Mon, 22 Sep 2014 21:25:15 +0200 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> Message-ID: <5420779B.40705@iue.tuwien.ac.at> Hi, > I am new to PETSc and trying to determine if GPU speedup is possible with the 3D Poisson solvers. I configured 2 copies of 'petsc-master' on a standalone machine, one with CUDA toolkit 5.0 and one without (both without MPI): > Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 > CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory > GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, Driver: 313.09) > > I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting about 20% speedup with GPU. Is this reasonable or did I miss something? That is fairly reasonable for your setting, yet the setup is not ideal: With the default ILU preconditioner, the residual gets copied between host and device in each iteration. Better use a preconditioner suitable for the GPU. 
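Concretely, that means switching the vectors and matrices to their GPU types and choosing a Krylov/preconditioner pair that stays on the device. Something along the following lines is worth trying on ex32 (the -dm_vec_type/-dm_mat_type spellings below are quoted from memory, so please double-check them against -help output for your build):

$ ./ex32 -dm_vec_type cusp -dm_mat_type aijcusp -ksp_type cg -pc_type sacusp -log_summary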
For a Poisson problem you should get good numbers with the algebraic multigrid preconditioner in CUSP (-pctype sacusp) For Poisson you may also try CG instead of GMRES to save all the orthogonalization costs - assuming that you use a symmetric preconditioner. > Attached is a comparison chart with two sample logs. The y-axis is the elapsed time in seconds and the x-axis corresponds to the size of the problem. In particular, I wonder if the numbers of calls to 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are excessive? They just manifest that the residual gets copied between host and device in each iteration because ILU is only run sequentially. Best regards, Karli From dmeiser at txcorp.com Mon Sep 22 14:38:29 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Mon, 22 Sep 2014 13:38:29 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> Message-ID: <54207AB5.2050104@txcorp.com> On 09/22/2014 12:57 PM, Chung Shen wrote: > Dear PETSc Users, > > I am new to PETSc and trying to determine if GPU speedup is possible with the 3D Poisson solvers. I configured 2 copies of 'petsc-master' on a standalone machine, one with CUDA toolkit 5.0 and one without (both without MPI): > Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 > CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory > GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, Driver: 313.09) > > I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting about 20% speedup with GPU. Is this reasonable or did I miss something? > > Attached is a comparison chart with two sample logs. The y-axis is the elapsed time in seconds and the x-axis corresponds to the size of the problem. In particular, I wonder if the numbers of calls to 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are excessive? > > Thanks in advance for your reply. > > Best Regards, > > Chung Shen A few comments: - To get reliable timing you should configure PETSc without debugging (i.e. --with-debugging=no) - The ILU preconditioning in your GPU benchmark is done on the CPU. The host-device data transfers are killing performance. Can you try to run with the additional option --pc_factor_mat_solver_packe cusparse? This will perform the preconditioning on the GPU. - If you're interested in running benchmarks in parallel you will need a few patches that are not yet in petsc/master. I can put together a branch that has the needed fixes. Cheers, Dominic -- Dominic Meiser Tech-X Corporation 5621 Arapahoe Avenue Boulder, CO 80303 USA Telephone: 303-996-2036 Fax: 303-448-7756 www.txcorp.com From ashwinsrnth at gmail.com Mon Sep 22 14:47:03 2014 From: ashwinsrnth at gmail.com (Ashwin Srinath) Date: Mon, 22 Sep 2014 15:47:03 -0400 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <54207AB5.2050104@txcorp.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> Message-ID: Dominic, I second a request for such a branch. Thanks, Ashwin On Mon, Sep 22, 2014 at 3:38 PM, Dominic Meiser wrote: > On 09/22/2014 12:57 PM, Chung Shen wrote: > >> Dear PETSc Users, >> >> I am new to PETSc and trying to determine if GPU speedup is possible with >> the 3D Poisson solvers. 
I configured 2 copies of 'petsc-master' on a >> standalone machine, one with CUDA toolkit 5.0 and one without (both without >> MPI): >> Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 >> CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory >> GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, >> Driver: 313.09) >> >> I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting >> about 20% speedup with GPU. Is this reasonable or did I miss something? >> >> Attached is a comparison chart with two sample logs. The y-axis is the >> elapsed time in seconds and the x-axis corresponds to the size of the >> problem. In particular, I wonder if the numbers of calls to 'VecCUSPCopyTo' >> and 'VecCUSPCopyFrom' shown in the GPU log are excessive? >> >> Thanks in advance for your reply. >> >> Best Regards, >> >> Chung Shen >> > A few comments: > > - To get reliable timing you should configure PETSc without debugging > (i.e. --with-debugging=no) > - The ILU preconditioning in your GPU benchmark is done on the CPU. The > host-device data transfers are killing performance. Can you try to run with > the additional option --pc_factor_mat_solver_packe cusparse? This will > perform the preconditioning on the GPU. > - If you're interested in running benchmarks in parallel you will need a > few patches that are not yet in petsc/master. I can put together a branch > that has the needed fixes. > > Cheers, > Dominic > > -- > Dominic Meiser > Tech-X Corporation > 5621 Arapahoe Avenue > Boulder, CO 80303 > USA > Telephone: 303-996-2036 > Fax: 303-448-7756 > www.txcorp.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.m.alletto at lmco.com Mon Sep 22 16:02:32 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Mon, 22 Sep 2014 21:02:32 +0000 Subject: [petsc-users] Stencil width for a 13 point 4th order stencil Message-ID: All, I have two code baseline one uses a standard 7 point STAR stencil the other a 13 point Star stencil. The first baseline works the second comes back with errors MatSetValuesStencil Argument out of range In the second baseline I have a fourth order 13 point stencil which spans +- 2 in all direction from each 3D point of my data set. stw = 2 I set the width in DMDACreate3d( ....dof,stw, 0, 0, 0, &da) Does a larger stencil need to be quantified or declare in another location? John -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 22 16:18:18 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Sep 2014 16:18:18 -0500 Subject: [petsc-users] Stencil width for a 13 point 4th order stencil In-Reply-To: References: Message-ID: <0AB31F8D-91B3-4125-B705-506EE424B9BC@mcs.anl.gov> You should not need to provide the same information elsewhere. Please send the entire error message. On Sep 22, 2014, at 4:02 PM, Alletto, John M wrote: > All, > > I have two code baseline one uses a standard 7 point STAR stencil the other a 13 point Star stencil. > > The first baseline works the second comes back with errors MatSetValuesStencil > Argument out of range > > In the second baseline I have a fourth order 13 point stencil which spans +- 2 in all direction from each 3D point of my data set. > > stw = 2 > I set the width in DMDACreate3d( ?.dof,stw, 0, 0, 0, &da) > > Does a larger stencil need to be quantified or declare in another location? 
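As noted above, the stencil width argument to DMDACreate3d should be the only place the stencil size is declared; DMCreateMatrix then preallocates for exactly that pattern. For reference, here is a minimal sketch of a width-2, 13-point star setup that stays within the preallocated pattern. The grid size, the trivial boundary rows and the 4th-order coefficients are purely illustrative, and the 1/h^2 scaling is omitted:

  #include <petscdmda.h>

  int main(int argc,char **argv)
  {
    DM             da;
    Mat            A;
    MatStencil     row,col[13];
    PetscScalar    v[13];
    PetscInt       i,j,k,xs,ys,zs,xm,ym,zm,mx,my,mz,n;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
    /* stencil width 2 so the +-2 offsets of the 13-point star are ghosted and preallocated */
    ierr = DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,
                        DMDA_STENCIL_STAR,17,17,17,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
                        1,2,NULL,NULL,NULL,&da);CHKERRQ(ierr);
    ierr = DMCreateMatrix(da,&A);CHKERRQ(ierr);
    ierr = DMDAGetInfo(da,NULL,&mx,&my,&mz,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);CHKERRQ(ierr);
    ierr = DMDAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);
    for (k=zs; k<zs+zm; k++) {
      for (j=ys; j<ys+ym; j++) {
        for (i=xs; i<xs+xm; i++) {
          row.i = i; row.j = j; row.k = k;
          if (i<2 || i>mx-3 || j<2 || j>my-3 || k<2 || k>mz-3) {
            /* keep rows near the physical boundary trivial in this sketch */
            v[0] = 1.0;
            ierr = MatSetValuesStencil(A,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr);
          } else {
            /* interior row: centre plus +-1 and +-2 in each direction, 13 entries total */
            PetscInt    d,off[2] = {1,2};
            PetscScalar c[2]     = {16.0/12.0,-1.0/12.0};
            n = 0;
            col[n].i = i; col[n].j = j; col[n].k = k; v[n++] = -3.0*30.0/12.0;
            for (d=0; d<2; d++) {
              col[n].i = i-off[d]; col[n].j = j;        col[n].k = k;        v[n++] = c[d];
              col[n].i = i+off[d]; col[n].j = j;        col[n].k = k;        v[n++] = c[d];
              col[n].i = i;        col[n].j = j-off[d]; col[n].k = k;        v[n++] = c[d];
              col[n].i = i;        col[n].j = j+off[d]; col[n].k = k;        v[n++] = c[d];
              col[n].i = i;        col[n].j = j;        col[n].k = k-off[d]; v[n++] = c[d];
              col[n].i = i;        col[n].j = j;        col[n].k = k+off[d]; v[n++] = c[d];
            }
            ierr = MatSetValuesStencil(A,1,&row,n,col,v,INSERT_VALUES);CHKERRQ(ierr);
          }
        }
      }
    }
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = DMDestroy(&da);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }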
> > John From john.m.alletto at lmco.com Mon Sep 22 18:41:37 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Mon, 22 Sep 2014 23:41:37 +0000 Subject: [petsc-users] Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? Message-ID: I am solving one of the PETSc 3D Laplacian examples with a 7 point stencil of width 1 and in a separate baseline with a 13 point stencil of width 2 (a more accurate mesh). What worked fast in terms of solvers and pre-conditioner for the less accurate baseline was non-optimal (very slow) for the more accurate baseline. Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? Thank you for your time John Alletto -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 22 18:53:06 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Sep 2014 18:53:06 -0500 Subject: [petsc-users] Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? In-Reply-To: References: Message-ID: John, For any non-trivial size problem for the Laplacian you definitely want to use multigrid. You can start by trying algebraic multigrid on both cases with -pc_type gamg Barry On Sep 22, 2014, at 6:41 PM, Alletto, John M wrote: > I am solving one of the PETSc 3D Laplacian examples with a 7 point stencil of width 1 and in a separate baseline with a 13 point stencil of width 2 (a more accurate mesh). > > What worked fast in terms of solvers and pre-conditioner for the less accurate baseline was non-optimal (very slow) for the more accurate baseline. > > Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? > > > Thank you for your time > John Alletto From 4bikerboyjohn at gmail.com Mon Sep 22 19:51:14 2014 From: 4bikerboyjohn at gmail.com (John Alletto) Date: Mon, 22 Sep 2014 17:51:14 -0700 Subject: [petsc-users] Laplacian at infinity In-Reply-To: References: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> <87zjdriw8o.fsf@jedbrown.org> Message-ID: <1510E7AD-7B02-45A1-8444-23CB42B846B7@gmail.com> Are there any PML example problems using PETSc? John On Sep 22, 2014, at 7:52 AM, Matthew Knepley wrote: > On Mon, Sep 22, 2014 at 9:08 AM, Jed Brown wrote: > Matthew Knepley writes: > > > On Mon, Sep 22, 2014 at 7:36 AM, John Alletto <4bikerboyjohn at gmail.com> > > wrote: > > > >> All, > >> > >> I am try to match some E&M problems with analytical solutions. > >> How do I deal with infinity when using a uniform grid ? > >> > > > > It depends on your problem. Keep making it bigger and see if you get > > convergence. > > This is a way to assess error caused by the finite grid, but you can > also use an analytic solution for that. Using an asymptotic expansion > for the boundary condition can be useful (esp. when there is nonzero > charge in the domain, for example). For wave propagation, look at > perfectly matched layers (PML). > > PML is the best option, although many codes simply use attenuation. > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 22 19:58:19 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Sep 2014 20:58:19 -0400 Subject: [petsc-users] Laplacian at infinity In-Reply-To: <1510E7AD-7B02-45A1-8444-23CB42B846B7@gmail.com> References: <9695B232-81E3-4841-9DA8-E844BDDAE2F0@gmail.com> <87zjdriw8o.fsf@jedbrown.org> <1510E7AD-7B02-45A1-8444-23CB42B846B7@gmail.com> Message-ID: On Mon, Sep 22, 2014 at 8:51 PM, John Alletto <4bikerboyjohn at gmail.com> wrote: > > Are there any PML example problems using PETSc? > I do not believe we have any. Thanks, Matt > John > > > On Sep 22, 2014, at 7:52 AM, Matthew Knepley wrote: > > On Mon, Sep 22, 2014 at 9:08 AM, Jed Brown wrote: > >> Matthew Knepley writes: >> >> > On Mon, Sep 22, 2014 at 7:36 AM, John Alletto <4bikerboyjohn at gmail.com> >> > wrote: >> > >> >> All, >> >> >> >> I am try to match some E&M problems with analytical solutions. >> >> How do I deal with infinity when using a uniform grid ? >> >> >> > >> > It depends on your problem. Keep making it bigger and see if you get >> > convergence. >> >> This is a way to assess error caused by the finite grid, but you can >> also use an analytic solution for that. Using an asymptotic expansion >> for the boundary condition can be useful (esp. when there is nonzero >> charge in the domain, for example). For wave propagation, look at >> perfectly matched layers (PML). >> > > PML is the best option, although many codes simply use attenuation. > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaolive at mit.edu Mon Sep 22 21:15:00 2014 From: jaolive at mit.edu (Jean-Arthur Louis Olive) Date: Tue, 23 Sep 2014 02:15:00 +0000 Subject: [petsc-users] problem with nested fieldsplits Message-ID: Hi all, I am using PETSc (dev version) to solve the Stokes + temperature equations. My DM has fields (vx, vy, p, T). I would like to use nested fieldsplits to separate the T part from the Stokes part, and apply a Schur complement approach to the Stokes block. 
Unfortunately, I keep getting this error message: [1]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after DMSetUp Here are the command line options I tried: -snes_type ksponly \ -ksp_type fgmres \ # define 2 fields: [vx vy p] and [T] -pc_type fieldsplit -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 \ # split [vx vy p] into 2 fields: [vx vy] and [p] -fieldsplit_0_pc_type fieldsplit \ -pc_fieldsplit_0_fieldsplit_0_fields 0,1 -pc_fieldsplit_0_fieldsplit_1_fields 2 \ # apply schur complement to [vx vy p] -fieldsplit_0_pc_fieldsplit_type schur \ -fieldsplit_0_pc_fieldsplit_schur_factorization_type upper \ # solve everything with lu, just for testing -fieldsplit_0_fieldsplit_0_ksp_type preonly \ -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_0_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_0_fieldsplit_1_ksp_type preonly \ -fieldsplit_0_fieldsplit_1_pc_type lu -fieldsplit_0_fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_1_ksp_type preonly \ -fieldsplit_1_pc_type lu -fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ Any idea what could be causing this? Thanks a lot, Arthur -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaolive at mit.edu Mon Sep 22 21:37:08 2014 From: jaolive at mit.edu (Jean-Arthur Louis Olive) Date: Tue, 23 Sep 2014 02:37:08 +0000 Subject: [petsc-users] problem with nested fieldsplits In-Reply-To: References: Message-ID: <6DE0AD79-490E-4C2D-A907-2EE984B03973@mit.edu> Hi all, below is the complete error message & list of options. Best, Arthur STARTING SOLVE FOR TIMESTEP: 1 0 KSP unpreconditioned resid norm 7.599999999605e+10 true resid norm 7.599999999605e+10 ||r(i)||/||b|| 1.000000000000e+00 [0]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after DMSetUp [1]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after DMSetUp [0]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21092 on display localhost:11.0 on machine fuxi [1]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21093 on display localhost:11.0 on machine fuxi [0]PETSC ERROR: PCFieldSplitSetDefaults() line 467 in /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c Unhandled case, must have at least two fields, not 0 [1]PETSC ERROR: PCFieldSplitSetDefaults() line 467 in /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c Unhandled case, must have at least two fields, not 0 [0]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21092 on display localhost:11.0 on machine fuxi [1]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21093 on display localhost:11.0 on machine fuxi [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD 
the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] DMCreateFieldDecomposition line 1251 /home/jolive/petsc/src/dm/interface/dm.c [1]PETSC ERROR: [1] PCFieldSplitSetDefaults line 320 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [1]PETSC ERROR: [1] PCSetUp_FieldSplit line 483 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [1]PETSC ERROR: [1] KSPSetUp line 219 /home/jolive/petsc/src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: [1] KSPSolve line 381 /home/jolive/petsc/src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: [1] PCApply_FieldSplit line 893 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [1]PETSC ERROR: [1] KSP_PCApply line 225 /home/jolive/petsc/include/petsc-private/kspimpl.h [1]PETSC ERROR: [1] KSPFGMRESCycle line 111 /home/jolive/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c [1]PETSC ERROR: [1] KSPSolve_FGMRES line 278 /home/jolive/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c [1]PETSC ERROR: [1] SNESSolve_KSPONLY line 13 /home/jolive/petsc/src/snes/impls/ksponly/ksponly.c [1]PETSC ERROR: [1] SNESSolve line 3687 /home/jolive/petsc/src/snes/interface/snes.c [1]PETSC ERROR: [1] Hipster_RunPicardIterations line 5553 /home/jolive/HIPSTER/hipster/developing_HiPStER/09-16-14_nested_fieldsplits/StokesSolve1.c [1]PETSC ERROR: User provided function() line 0 in unknown file [1]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21093 on display localhost:11.0 on machine fuxi [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] DMCreateFieldDecomposition line 1251 /home/jolive/petsc/src/dm/interface/dm.c [0]PETSC ERROR: [0] PCFieldSplitSetDefaults line 320 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: [0] PCSetUp_FieldSplit line 483 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: [0] KSPSetUp line 219 /home/jolive/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 381 /home/jolive/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] PCApply_FieldSplit line 893 /home/jolive/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: [0] KSP_PCApply line 225 /home/jolive/petsc/include/petsc-private/kspimpl.h [0]PETSC ERROR: [0] KSPFGMRESCycle line 111 /home/jolive/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c [0]PETSC ERROR: [0] KSPSolve_FGMRES line 278 /home/jolive/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 13 /home/jolive/petsc/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3687 /home/jolive/petsc/src/snes/interface/snes.c [0]PETSC ERROR: [0] Hipster_RunPicardIterations line 5553 /home/jolive/HIPSTER/hipster/developing_HiPStER/09-16-14_nested_fieldsplits/StokesSolve1.c [0]PETSC ERROR: User provided function() line 0 in unknown file [0]PETSC ERROR: PETSC: Attaching gdb to ./Stokes of pid 21092 on display localhost:11.0 on machine fuxi options: -ksp_view -ksp_monitor_true_residual -ksp_converged_reason \ -ksp_type fgmres -ksp_rtol 1.0e-6 \ -pc_type fieldsplit -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 \ -fieldsplit_0_pc_type fieldsplit \ -pc_fieldsplit_0_fieldsplit_0_fields 0,1 -pc_fieldsplit_0_fieldsplit_1_fields 2 \ -fieldsplit_0_pc_fieldsplit_type schur \ -fieldsplit_0_pc_fieldsplit_schur_factorization_type upper \ -fieldsplit_0_fieldsplit_0_ksp_type preonly \ -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_0_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_0_fieldsplit_1_ksp_type preonly \ -fieldsplit_0_fieldsplit_1_pc_type lu -fieldsplit_0_fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_1_ksp_type preonly \ -fieldsplit_1_pc_type lu -fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ -snes_type ksponly \ -snes_converged_reason -snes_linesearch_monitor true On Sep 22, 2014, at 10:15 PM, Jean-Arthur Louis Olive > wrote: Hi all, I am using PETSc (dev version) to solve the Stokes + temperature equations. My DM has fields (vx, vy, p, T). I would like to use nested fieldsplits to separate the T part from the Stokes part, and apply a Schur complement approach to the Stokes block. 
Unfortunately, I keep getting this error message: [1]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after DMSetUp Here are the command line options I tried: -snes_type ksponly \ -ksp_type fgmres \ # define 2 fields: [vx vy p] and [T] -pc_type fieldsplit -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 \ # split [vx vy p] into 2 fields: [vx vy] and [p] -fieldsplit_0_pc_type fieldsplit \ -pc_fieldsplit_0_fieldsplit_0_fields 0,1 -pc_fieldsplit_0_fieldsplit_1_fields 2 \ # apply schur complement to [vx vy p] -fieldsplit_0_pc_fieldsplit_type schur \ -fieldsplit_0_pc_fieldsplit_schur_factorization_type upper \ # solve everything with lu, just for testing -fieldsplit_0_fieldsplit_0_ksp_type preonly \ -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_0_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_0_fieldsplit_1_ksp_type preonly \ -fieldsplit_0_fieldsplit_1_pc_type lu -fieldsplit_0_fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ -fieldsplit_1_ksp_type preonly \ -fieldsplit_1_pc_type lu -fieldsplit_1_pc_factor_mat_solver_package superlu_dist \ Any idea what could be causing this? Thanks a lot, Arthur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Sep 22 22:47:43 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Sep 2014 22:47:43 -0500 Subject: [petsc-users] PCASM in transposed format In-Reply-To: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BBE1@MTLWAEXCH004.ca.aero.bombardier.net> References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BBE1@MTLWAEXCH004.ca.aero.bombardier.net> Message-ID: <9CE5CE69-3C14-437A-A532-1CFACA93652E@mcs.anl.gov> Antoine, That is one nasty matrix! You are actually getting essentially garbage during the solution process with and without the transpose. There is no reason to think that the additive Schwarz method, or any standard iterative method will work much at all on this matrix. You should run with -ksp_monitor_true_residual to see that, though in the ?preconditioned? ?norm? it looks like reasonable convergence, it is actually not giving reasonable convergence in the two norm of the true residual regardless of transpose. I computed A?*A - A*A? and the matrix is extremely far from being normal, non-normal matrices are hard to solve with iterative methods. The diagonal entries of the matrix are microscopic compared to some non-diagonal entries, this is also a problem for iterative methods I also ran with -ksp_monitor_singular_value and -ksp_plot_eigenvalues -ksp_gmres_restart the condition number of the matrix is not very large, maybe 10^5 but the distribution of the eigenvalues (er singular values) is nasty, lots and lots lined up on the imaginary axis. I ran with the PETSc built in sparse LU solver and got satisfactory results ./ex10 -f0 ~/Datafiles/Matrices/A_and_rhs.bin -pc_type lu -ksp_monitor_true 0 KSP preconditioned resid norm 4.620896262080e-02 true resid norm 1.367267281011e-05 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 6.143025615547e-13 true resid norm 3.207400714524e-16 ||r(i)||/||b|| 2.345847632770e-11 pretty quickly. My recommendation is either to reformulate the problem to not get such a nasty matrix or use a direct solver. Iterative methods are going to hopeless for yo. Barry On Sep 22, 2014, at 1:28 PM, Antoine De Blois wrote: > Dear all, > > I am using the ASM preconditioner to solve a transposed system through MatSolveTranspose. 
Strangely, the results I obtain differ in each call. Is there a non-deterministic operations within ASM? If I use it in a non-transposed way, I get correct results? If I use GASM, then the results are always the same, both in transposed and non-transposed formats. > > Below is a log of my calls (note that some of them actually diverged). > I can give you my matrix and rhs if you want. > > Regards, > Antoine > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 690 > Residual norm 0.000617544 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 475 > Residual norm 0.000727253 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 10000 > Residual norm 1.3866 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 568 > Residual norm 0.000684401 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 540 > Residual norm 0.000555548 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 10000 > Residual norm 1.30198 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type asm > Number of iterations = 207 > Residual norm 0.000555849 > > > --------------- > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm > Number of iterations = 297 > Residual norm 0.000600143 > > $ mpirun -np 4 /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/examples/tutorials/ex10 -f0 A_and_rhs.bin -trans -pc_type gasm > Number of iterations = 297 > Residual norm 0.000600143 > > > > Antoine DeBlois > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead > A?ronautique / Aerospace > 514-855-5001, x 50862 > antoine.deblois at aero.bombardier.com > > 2351 Blvd Alfred-Nobel > Montreal, Qc > H4S 1A9 > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication by error, please notify the sender > and delete the message without copying From rupp at iue.tuwien.ac.at Mon Sep 22 23:37:10 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Tue, 23 Sep 2014 06:37:10 +0200 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <54207AB5.2050104@txcorp.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> Message-ID: <5420F8F6.400@iue.tuwien.ac.at> Hi Dominic, I've got some time available at the end of this week for a merge to next. Is there anything other than PR #178 needed? It currently shows some conflicts, so is there any chance to rebase it on ~Thursday? Best regards, Karli On 09/22/2014 09:38 PM, Dominic Meiser wrote: > On 09/22/2014 12:57 PM, Chung Shen wrote: >> Dear PETSc Users, >> >> I am new to PETSc and trying to determine if GPU speedup is possible >> with the 3D Poisson solvers. 
I configured 2 copies of 'petsc-master' >> on a standalone machine, one with CUDA toolkit 5.0 and one without >> (both without MPI): >> Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 >> CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory >> GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, >> Driver: 313.09) >> >> I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting >> about 20% speedup with GPU. Is this reasonable or did I miss something? >> >> Attached is a comparison chart with two sample logs. The y-axis is the >> elapsed time in seconds and the x-axis corresponds to the size of the >> problem. In particular, I wonder if the numbers of calls to >> 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are excessive? >> >> Thanks in advance for your reply. >> >> Best Regards, >> >> Chung Shen > A few comments: > > - To get reliable timing you should configure PETSc without debugging > (i.e. --with-debugging=no) > - The ILU preconditioning in your GPU benchmark is done on the CPU. The > host-device data transfers are killing performance. Can you try to run > with the additional option --pc_factor_mat_solver_packe cusparse? This > will perform the preconditioning on the GPU. > - If you're interested in running benchmarks in parallel you will need a > few patches that are not yet in petsc/master. I can put together a > branch that has the needed fixes. > > Cheers, > Dominic > From jed at jedbrown.org Tue Sep 23 00:31:21 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 22 Sep 2014 23:31:21 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <54207AB5.2050104@txcorp.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> Message-ID: <878ulaj42e.fsf@jedbrown.org> Dominic Meiser writes: > - To get reliable timing you should configure PETSc without debugging > (i.e. --with-debugging=no) > - The ILU preconditioning in your GPU benchmark is done on the CPU. The > host-device data transfers are killing performance. Can you try to run > with the additional option --pc_factor_mat_solver_packe cusparse? This > will perform the preconditioning on the GPU. > - If you're interested in running benchmarks in parallel you will need a > few patches that are not yet in petsc/master. I can put together a > branch that has the needed fixes. And for the CPU version, considering using a configuration that makes sense there. Like FMG with Gauss-Seidel or Chebyshev smoothers and an error tolerance proportional to discretization error. You might find that not enough time is spent on the fine grid to see a significant speed-up. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Tue Sep 23 00:35:09 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 22 Sep 2014 23:35:09 -0600 Subject: [petsc-users] Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? In-Reply-To: References: Message-ID: <8761gej3w2.fsf@jedbrown.org> "Alletto, John M" writes: > I am solving one of the PETSc 3D Laplacian examples with a 7 point stencil of width 1 and in a separate baseline with a 13 point stencil of width 2 (a more accurate mesh). > > What worked fast in terms of solvers and pre-conditioner for the less accurate baseline was non-optimal (very slow) for the more accurate baseline. 
> > Are there a set of general rules for picking preconditioners and solvers when solving a 3D Laplace's equation? Always use Full Multigrid. If solving up to discretization error takes more than 5 work units, something is wrong. PETSc does not have built-in high-order interpolation operators like would be most suitable for the 13-point stencil (assuming it is higher order accurate). Dealing with boundary conditions for high-order FD and (especially) FV methods are a common source of errors in MG solvers. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From csp at info.szwgroup.com Tue Sep 23 02:16:06 2014 From: csp at info.szwgroup.com (=?utf-8?B?TXMuIEVsbGEgV2Vp?=) Date: Tue, 23 Sep 2014 15:16:06 +0800 (CST) Subject: [petsc-users] =?utf-8?q?Eskom=2C_SolarReserve=2C_Acciona_and_KfW_?= =?utf-8?q?join_CSP_Focus_South_Africa_November?= Message-ID: <20140923071606.5D3C140A9C3A@mx2.easetopmail.com> An HTML attachment was scrubbed... URL: From hillsmattc at outlook.com Tue Sep 23 07:07:04 2014 From: hillsmattc at outlook.com (Matthew Hills) Date: Tue, 23 Sep 2014 14:07:04 +0200 Subject: [petsc-users] PETSc/TAU configuration In-Reply-To: References: , , , , Message-ID: Hi PETSc Team, I successfully configured PETSc with TAU using: > ./configure --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --download-f-blas-lapack=${SESKADIR}/packages/downloads/fblaslapacklinpack-3.1.1.tar.gz > make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all While building I received this error message: [100%] Built target petsc **************************ERROR************************************ Error during compile, check linux-gnu-cxx-opt/conf/make.log Send it and linux-gnu-cxx-opt/conf/configure.log to petsc-maint at mcs.anl.gov ******************************************************************** Because of their large file size, I've attached links to the log files below. Any assistance would be greatly appreciated. make.log - https://www.dropbox.com/s/n6y8nl8go4s41tq/make.log?dl=0 configure.log - https://www.dropbox.com/s/ipvt4urvgldft8x/configure.log?dl=0 Matthew > Date: Mon, 22 Sep 2014 10:24:15 -0500 > From: balay at mcs.anl.gov > To: hillsmattc at outlook.com > CC: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] PETSc/TAU configuration > > On Mon, 22 Sep 2014, Matthew Hills wrote: > > > Hi PETSc Team, > > > > I'm still experiencing difficulties with configuring PETSc with TAU. I'm currently: > > > > building OpenMPI > > 1. ./configure --prefix=${SESKADIR}/packages/openmpi > > 2. make all install > > > > set library path > > 1. export LD_LIBRARY_PATH=${SESKADIR}/lib:${SESKADIR}/packages/openmpi /lib:${SESKADIR}/packages/pdt/x86_64/lib:/${SESKADIR}/packages/tau/x86_64/lib:${SESKADIR}/packages/petsc/${PETSC_ARCH}/lib:$LD_LIBRARY_PATH > > 2. export PATH=${SESKADIR}/bin:${SESKADIR}/packages/petsc/${PETSC_ARCH}/bin:$PATH > > > > build PDT (pdtoolkit-3.20) > > 1. ./configure -GNU > > 2. export PATH=${SESKADIR}/packages/pdt/x86_64/bin:${SESKADIR}/packages/pdt/x86_64//bin:$PATH > > 5. make > > 6. make install > > > > build TAU (tau-2.23.1) using OpenMPI > > 1. ./configure -prefix=`pwd` -cc=mpicc -c++=mpicxx -fortran=mpif90 -pdt=${SESKADIR}/packages/pdt -mpiinc=${SESKADIR}/packages/openmpi/include -mpilib=${SESKADIR}/packages/openmpi/lib -bfd=download > > 2. export PATH=${SESKADIR}/packages/tau/x86_64/bin:$PATH > > 3. 
make install > > > > build fblaslapacklinpack-3.1.1 > > 1. make > > Should have said '--download-fblaslapack' would be fine here [as it > uses mpif90 - not tau_cc.sh]. Building seperately is also fine. > > > build PETSc using TAU_CC/MPI > > 1. export TAU_MAKEFILE=${SESKADIR}/packages/tau/x86_64/lib/Makefile.tau-mpi-pdt > > 2. ./configure --prefix='pwd' --with-mpi=1 --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 --with-blas-lapack-dir=${SESKADIR}/packages/fblaslapack > > --prefix='pwd' doesn't make sense. Please remove it. > > > Error: Tried looking for file: /tmp/petsc-U9YCMv/config.setCompilers/conftest > > Error: Failed to link with TAU options > > Error: Command(Executable) is -- gcc > > configure.log looks complete [and indicates a successful run]. Did the > messages above come during configure step on the terminal? > > Can you try the following and see if PETSc builds successfully? [but > recommend rerunning configure first - without --prefix option] > > make PETSC_DIR=/home/hills/seska/packages/petsc PETSC_ARCH=linux-gnu-cxx-opt all > > Satish > > > > > > > Attached you'll find my configure log. Any assistance would be greatly appreciated. > > > > Warm regards, > > Matthew > > > > > Date: Tue, 16 Sep 2014 08:21:41 -0500 > > > From: balay at mcs.anl.gov > > > To: hillsmattc at outlook.com > > > CC: petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] PETSc/TAU configuration > > > > > > I haven't tried using TAU in a while - but here are some obvious things to try. > > > > > > 1. --download-mpich [or openmpi] with TAU does not make sense. > > > > > > You would have to build MPICH/OpenMPI first. > > > > > > Then build TAU to use this MPI. > > > > > > And then build PETSc to use this TAU_CC/MPI > > > > > > 2. I would use only tau_cc.sh - and not bother with c++/fortran > > > > > > i.e [with TAU build with a given mpicc] - configure PETSc with: > > > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 > > > > > > 3. Do not use any --download-package when using tau_cc.sh. First check > > > if you are able to use TAU with PETSc - without externalpackages [you > > > would need blas,mpi. Use system blas/lapack for blas/lapack - and > > > build MPI as mentioned above for use with TAU and later PETSc] > > > > > > And if you really need these externalpackage [assuming the above basic > > > build with TAU works] - I would recommend the following 2 step build process: > > > > > > > > > 4.1. ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-PACKAGE PETSC_ARCH=arch-packages > > > > > > 4.2. Now strip out the petsc relavent stuff from this location > > > rm -f arch-packages/include/petsc*.h > > > > > > 4.3. Now build PETSc with TAU - using these prebuilt-packages > > > > > > ./configure --with-cc=tau_cc.sh --with-cxx=mpicxx --with-fc=mpif90 PETSC_ARCH=arch-tau --with-PACKAGE-dir=`pwd`/arch-packages > > > > > > BTW: the current release is petsc-3.5 - we recommend upgrading to > > > using it [as we usually support the latest release wrt debugging/bug > > > fixes] > > > > > > Satish > > > > > > > > > On Tue, 16 Sep 2014, Matthew Hills wrote: > > > > > > > > Hi PETSc Team,] > > > > > > > > I am experiencing difficulties with configuring PETSc with TAU. I have replaced the standard compilers with the tau_cc.sh, tau_cxx.sh, and tau_f90.sh scripts but this produces multiple errors. I have also attempted to use OpenMPI and MPICH, but both produce their own unique errors. 
> > > > > > > > After successfully compiling PDT, TAU was compiled with: > > > > > > > > ./configure > > > > -prefix=`pwd` > > > > -cc=gcc > > > > -c++=g++ -fortran=gfortran > > > > -pdt=${SESKADIR}/packages/pdt > > > > -mpiinc=${PETSC_DIR}/${PETSC_ARCH}/include > > > > -mpilib=${PETSC_DIR}/${PETSC_ARCH}/lib -bfd=download > > > > > > > > Attached you'll find the PETSc configuration logs. If any more information is needed please let me know. > > > > > > > > Warm regards, > > > > > > > > Matthew Hills > > > > University of Cape Town > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmeiser at txcorp.com Tue Sep 23 09:52:23 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Tue, 23 Sep 2014 08:52:23 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <5420F8F6.400@iue.tuwien.ac.at> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> <5420F8F6.400@iue.tuwien.ac.at> Message-ID: <54218927.9040504@txcorp.com> Hi Karli, PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses DMDA's which require a few additional fixes. I haven't opened a pull request for these yet but I will do that before Thursday. Regarding the rebase, wouldn't it be preferable to just resolve the conflicts in the merge commit? In any event, I've merged these branches several times into local integration branches created off of recent petsc/master branches so I'm pretty familiar with the conflicts and how to resolve them. I can help with the merge or do a rebase, whichever you prefer. Cheers, Dominic On 09/22/2014 10:37 PM, Karl Rupp wrote: > Hi Dominic, > > I've got some time available at the end of this week for a merge to > next. Is there anything other than PR #178 needed? It currently shows > some conflicts, so is there any chance to rebase it on ~Thursday? > > Best regards, > Karli > > > > On 09/22/2014 09:38 PM, Dominic Meiser wrote: >> On 09/22/2014 12:57 PM, Chung Shen wrote: >>> Dear PETSc Users, >>> >>> I am new to PETSc and trying to determine if GPU speedup is possible >>> with the 3D Poisson solvers. I configured 2 copies of 'petsc-master' >>> on a standalone machine, one with CUDA toolkit 5.0 and one without >>> (both without MPI): >>> Machine: HP Z820 Workstation, Redhat Enterprise Linux 5.0 >>> CPU: (x2) 8-core Xeon E5-2650 2.0GHz, 128GB Memory >>> GPU: (x2) Tesla K20c (706MHz, 5.12GB Memory, Cuda Compatibility: 3.5, >>> Driver: 313.09) >>> >>> I used 'src/ksp/ksp/examples/tests/ex32.c' as a test and was getting >>> about 20% speedup with GPU. Is this reasonable or did I miss something? >>> >>> Attached is a comparison chart with two sample logs. The y-axis is the >>> elapsed time in seconds and the x-axis corresponds to the size of the >>> problem. In particular, I wonder if the numbers of calls to >>> 'VecCUSPCopyTo' and 'VecCUSPCopyFrom' shown in the GPU log are >>> excessive? >>> >>> Thanks in advance for your reply. >>> >>> Best Regards, >>> >>> Chung Shen >> A few comments: >> >> - To get reliable timing you should configure PETSc without debugging >> (i.e. --with-debugging=no) >> - The ILU preconditioning in your GPU benchmark is done on the CPU. The >> host-device data transfers are killing performance. Can you try to run >> with the additional option --pc_factor_mat_solver_packe cusparse? This >> will perform the preconditioning on the GPU. 
>> - If you're interested in running benchmarks in parallel you will need a >> few patches that are not yet in petsc/master. I can put together a >> branch that has the needed fixes. >> >> Cheers, >> Dominic >> > -- Dominic Meiser Tech-X Corporation 5621 Arapahoe Avenue Boulder, CO 80303 USA Telephone: 303-996-2036 Fax: 303-448-7756 www.txcorp.com From popov at uni-mainz.de Tue Sep 23 09:50:57 2014 From: popov at uni-mainz.de (anton) Date: Tue, 23 Sep 2014 16:50:57 +0200 Subject: [petsc-users] SNESSetJacobian Message-ID: <542188D1.40300@uni-mainz.de> Starting from version 3.5 the matrix parameters in SNESSetJacobian are no longer pointers, hence my question: What is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? What about FormFunction? Thanks, Anton From hzhang at mcs.anl.gov Tue Sep 23 09:55:32 2014 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 23 Sep 2014 09:55:32 -0500 Subject: [petsc-users] Valgrind Errors In-Reply-To: References: <06D6C4A02103674E8911418149538BA4121946DE@PW00INFMAI003.digitalglobe.com> <6F575569-1E91-4530-98CE-EB89A4D99E8F@mcs.anl.gov> <54135A29.1060601@txcorp.com> <06D6C4A02103674E8911418149538BA412195AB5@PW00INFMAI003.digitalglobe.com> Message-ID: James, The fix is pushed to petsc-maint (release) https://bitbucket.org/petsc/petsc/commits/c974faeda5a26542265b90934a889773ab380866 Thanks for your report! Hong On Mon, Sep 15, 2014 at 5:05 PM, Hong wrote: > James : > I'm fixing it in branch > hzhang/matmatmult-bugfix > https://bitbucket.org/petsc/petsc/commits/a7c7454dd425191f4a23aa5860b8c6bac03cfd7b > > Once it is further cleaned, and other routines are checked, I will > patch petsc-release. > > Hong > >> Hi Barry, >> >> Thanks for the response. You're right, it (both ex70 and my own code) doesn't give those valgrind errors when I run it in parallel. Changing the type to MATAIJ also fixes the issue. >> >> Thanks for the help, I appreciate it. >> >> James >> >> >> >> >> >>> -----Original Message----- >>> From: Hong [mailto:hzhang at mcs.anl.gov] >>> Sent: Friday, September 12, 2014 4:29 PM >>> To: Dominic Meiser >>> Cc: Barry Smith; James Balasalle; Zhang, Hong; petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] Valgrind Errors >>> >>> I'll check it. >>> Hong >>> >>> On Fri, Sep 12, 2014 at 3:40 PM, Dominic Meiser >>> wrote: >>> > On 09/12/2014 02:11 PM, Barry Smith wrote: >>> >> >>> >> James (and Hong), >>> >> >>> >> Do you ever see this problem in parallel runs? >>> >> >>> >> You are not doing anything wrong. >>> >> >>> >> Here is what is happening. >>> >> >>> >> MatGetBrowsOfAoCols_MPIAIJ() which is used by >>> >> MatMatMult_MPIAIJ_MPIAIJ() assumes that the VecScatters for the >>> >> matrix-vector products are >>> >> >>> >> gen_to = (VecScatter_MPI_General*)ctx->todata; >>> >> gen_from = (VecScatter_MPI_General*)ctx->from data; >>> >> >>> >> but when run on one process the scatters are not of that form; hence >>> >> the code accesses values in what it thinks is one struct but is >>> >> actually a different one. Hence the valgrind errors. >>> >> >>> >> But since the matrix only lives on one process there is actually >>> >> nothing to move between processors hence no error happens in the >>> >> computation. You can avoid the issue completely by using MATAIJ >>> >> matrix for the type instead of MATMPIAIJ and then on one process it >>> automatically uses MATSEQAIJ. >>> >> >>> >> I don?t think the bug has anything in particular to do with the >>> >> MatTranspose. 
>>> >> >>> >> Hong, >>> >> >>> >> Can you please fix this code? Essentially you can by pass parts >>> >> of the code when the Mat is on only one process. (Maybe this also >>> >> happens for MPIBAIJ matrices?) Send a response letting me know you >>> saw this. >>> >> >>> >> Thanks >>> >> >>> >> Barry >>> > >>> > I had to fix a few issues similar to this a while back. The method >>> > VecScatterGetTypes_Private introduced in pull request 176 might be >>> > useful in this context. >>> > >>> > Cheers, >>> > Dominic >>> > >> >> >> >> >> >> >> >> >> This electronic communication and any attachments may contain confidential and proprietary >> information of DigitalGlobe, Inc. If you are not the intended recipient, or an agent or employee >> responsible for delivering this communication to the intended recipient, or if you have received >> this communication in error, please do not print, copy, retransmit, disseminate or >> otherwise use the information. Please indicate to the sender that you have received this >> communication in error, and delete the copy you received. DigitalGlobe reserves the >> right to monitor any electronic communication sent or received by its employees, agents >> or representatives. >> From hzhang at mcs.anl.gov Tue Sep 23 10:48:51 2014 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 23 Sep 2014 10:48:51 -0500 Subject: [petsc-users] superlu_dist and MatSolveTranspose In-Reply-To: References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED55AED@MTLWAEXCH005.ca.aero.bombardier.net> <87vbpc49e3.fsf@jedbrown.org> <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED5BEDF@MTLWAEXCH005.ca.aero.bombardier.net> <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BB9A@MTLWAEXCH004.ca.aero.bombardier.net> Message-ID: Antoine, I just find out that superlu_dist does not support MatSolveTransport yet (see Sherry's email below). Once superlu_dist provides this support, we can add it to the petsc/superlu_dist interface. Thanks for your patience. Hong ------------------------------------------- Hong, Sorry, the transposed solve is not there yet; it's not as simple as serial version, because here, it requires to set up entirely different communication pattern. I will try to find time to do it. Sherry On Tue, Sep 23, 2014 at 8:11 AM, Hong wrote: > > Sherry, > Can superlu_dist be used for solving A^T x = b? > > Using the option > options.Trans = TRANS; > with the existing petsc-superlu_dist interface, I cannot get correct solution. > > Hong On Mon, Sep 22, 2014 at 12:47 PM, Hong wrote: > I'll add it. It would not take too long, just matter of priority. > I'll try to get it done in a day or two, then let you know when it works. > > Hong > > On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois > wrote: >> Dear all, >> >> Sorry for the delay on this topic. >> >> Thank you Gaetan for your suggestion. I had thought about doing that originally, but I had left it out since I thought that a rank owned the entire row of the matrix (and not only the sub-diagonal part). I will certainly give it a try. >> >> I still need the MatSolveTranspose since I need the ability to reuse the residual jacobian matrix from the flow (a 1st order approximation of it), which is assembled in a non-transposed format. This way the adjoint system is solved in a pseudo-time step manner, where the product of the exact jacobian matrix and the adjoint vector is used as a source term in the rhs. >> >> Hong, do you have an estimation of the time required to implement it in superlu_dist? 
>> >> Best, >> Antoine >> >> -----Message d'origine----- >> De : Hong [mailto:hzhang at mcs.anl.gov] >> Envoy? : Friday, August 29, 2014 9:14 PM >> ? : Gaetan Kenway >> Cc : Antoine De Blois; petsc-users at mcs.anl.gov >> Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose >> >> We can add MatSolveTranspose() to the petsc interface with superlu_dist. >> >> Jed, >> Are you working on it? If not, I can work on it. >> >> Hong >> >> On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway wrote: >>> Hi Antoine >>> >>> We are also using PETSc for solving adjoint systems resulting from >>> CFD. To get around the matSolveTranspose issue we just assemble the >>> transpose matrix directly and then call KSPSolve(). If this is >>> possible in your application I think it is probably the best approach >>> >>> Gaetan >>> >>> >>> On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois >>> wrote: >>>> >>>> Hello Jed, >>>> >>>> Thank you for your quick response. So I spent some time to dig deeper >>>> into my problem. I coded a shell script that passes through a bunch >>>> of ksp_type, pc_type and sub_pc_type. So please disregard the comment >>>> about the "does not converge properly for transpose". I had taken >>>> that conclusion from my own code (and not from the ex10 and extracted >>>> matrix), and a KSPSetFromOptions was missing. Apologies for that. >>>> >>>> What remains is the performance issue. The MatSolveTranspose takes a >>>> very long time to converge. For a matrix of 3 million rows, >>>> MatSolveTranspose takes roughly 5 minutes on 64 cpus, whereas the >>>> MatSolve is almost instantaneous!. When I gdb my code, petsc seems to >>>> be stalled in the MatLUFactorNumeric_SeqAIJ_Inode () for a long time. >>>> I also did a top on the compute node to check the RAM usage. It was >>>> hovering over 2 gig, so memory usage does not seem to be an issue here. 
>>>> >>>> #0 0x00002afe8dfebd08 in MatLUFactorNumeric_SeqAIJ_Inode () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #1 0x00002afe8e07f15c in MatLUFactorNumeric () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #2 0x00002afe8e2afa99 in PCSetUp_ILU () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #3 0x00002afe8e337c0d in PCSetUp () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #4 0x00002afe8e39d643 in KSPSetUp () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #5 0x00002afe8e39e3ee in KSPSolveTranspose () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #6 0x00002afe8e300f8c in PCApplyTranspose_ASM () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #7 0x00002afe8e338c13 in PCApplyTranspose () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #8 0x00002afe8e3a8a84 in KSPInitialResidual () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #9 0x00002afe8e376c32 in KSPSolve_GMRES () >>>> from >>>> /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.5.1/lib/li >>>> bpetsc.so.3.5 >>>> #10 0x00002afe8e39e425 in KSPSolveTranspose () >>>> >>>> For that particular application, I was using: >>>> ksp_type: gmres >>>> pc_type: asm >>>> sub_pc_type: ilu >>>> adj_sub_pc_factor_levels 1 >>>> >>>> For small matrices, the MatSolveTranspose computing time is very >>>> similar to the simple MatSolve. >>>> >>>> And if I want to revert to a MatTranspose followed by the MatSolve, >>>> then the MatTranspose takes forever to finish... For a matrix of 3 >>>> million rows, MatTranspose takes 30 minutes on 64 cpus!! >>>> >>>> So thank you for implementing the transpose solve in superlu_dist. It >>>> would also be nice to have it with hypre. >>>> Let me know what you think and ideas on how to improve my >>>> computational time, Regards, Antoine >>>> >>>> -----Message d'origine----- >>>> De : Jed Brown [mailto:jed at jedbrown.org] Envoy? : Thursday, August >>>> 28, 2014 5:01 PM ? : Antoine De Blois; 'petsc-users at mcs.anl.gov' >>>> Objet : Re: [petsc-users] superlu_dist and MatSolveTranspose >>>> >>>> Antoine De Blois writes: >>>> >>>> > Hello everyone, >>>> > >>>> > I am trying to solve a A^T x = b system. For my applications, I had >>>> > realized that the MatSolveTranspose does not converge properly. >>>> >>>> What do you mean "does not converge properly"? Can you send a test >>>> case where the transpose solve should be equivalent, but is not? We >>>> have only a few tests for transpose solve and not all preconditioners >>>> support it, but where it is supported, we want to ensure that it is correct. >>>> >>>> > Therefore, I had implemented a MatTranspose followed by a MatSolve. >>>> > This proved to converge perfectly (which is strange since the >>>> > transposed matrix has the same eigenvalues as the untransposed...). >>>> > The problem is that for bigger matrices, the MatTranspose is very >>>> > costly and thus cannot be used. >>>> >>>> Costly in terms of memory? (I want you to be able to use >>>> KSPSolveTranspose, but I'm curious what you're experiencing.) 
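A side note on the cost of forming the transpose explicitly: when only the action of A^T is needed, MatCreateTranspose() returns a thin wrapper whose MatMult() applies MatMultTranspose() of the original matrix, so nothing new is assembled. Whether a given preconditioner accepts such a wrapped matrix has to be checked case by case, since factorization-based PCs need the actual entries, so treat the sketch below (which assumes ksp, A, b and x are already set up) strictly as something to experiment with:

  Mat At;
  ierr = MatCreateTranspose(A,&At);CHKERRQ(ierr);   /* At applies A^T without storing it */
  ierr = KSPSetOperators(ksp,At,At);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);           /* iterates on A^T x = b             */
  ierr = MatDestroy(&At);CHKERRQ(ierr);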
>>>> >>>> > I tried using the superlu_dist package. Although it the package >>>> > works perfectly for the MatSolve, I get the an "No support for this >>>> > operation for this object type" error with MatSolveTransopse. I >>>> > reproduced the error using the MatView an ex10 tutorial. I can >>>> > provide the matrix and rhs upon request. My command line was: >>>> > >>>> > ex10 -f0 A_and_rhs.bin -pc_type lu -pc_factor_mat_solver_package >>>> > superlu_dist -trans >>>> > >>>> > So it there an additional parameter I need to use for the >>>> > transposed solve? >>>> > >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> > -------------------------------------------------------------- >>>> > [0]PETSC ERROR: No support for this operation for this object type >>>> > [0]PETSC ERROR: Matrix type mpiaij >>>> >>>> This is easy to add. I'll do it now. >>>> >>>> > [0]PETSC ERROR: See >>>> > http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> > for trouble shooting. >>>> > [0]PETSC ERROR: Petsc Release Version 3.5.1, unknown [0]PETSC ERROR: >>>> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >>>> > s/t >>>> > utorials/ex10 on a ARGUS_impi_opt named hpc-user11 by ad007804 Thu >>>> > Aug >>>> > 28 16:41:15 2014 [0]PETSC ERROR: Configure options --CFLAGS="-xHost >>>> > -axAVX" --download-hypre --download-metis --download-ml >>>> > --download-parmetis --download-scalapack --download-superlu_dist >>>> > --download-mumps --with-c2html=0 --with-cc=mpiicc >>>> > --with-fc=mpiifort --with-cxx=mpiicpc --with-debugging=yes >>>> > --prefix=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/petsc-3.5. >>>> > 1 >>>> > --with-cmake=/gpfs/fs1/aero/SOFTWARE/TOOLS/CMAKE/cmake-2.8.7/bin/cm >>>> > ake >>>> > --with-valgrind=/gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/valgrind- >>>> > 3.9 .0/bin/valgrind --with-shared-libraries=0 [0]PETSC ERROR: #1 >>>> > MatSolveTranspose() line 3473 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/mat/interface/m >>>> > atr ix.c [0]PETSC ERROR: #2 PCApplyTranspose_LU() line 214 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/impls/fa >>>> > cto r/lu/lu.c [0]PETSC ERROR: #3 PCApplyTranspose() line 573 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/pc/interfac >>>> > e/p recon.c [0]PETSC ERROR: #4 KSP_PCApply() line 233 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/include/petsc-priva >>>> > te/ kspimpl.h [0]PETSC ERROR: #5 KSPInitialResidual() line 63 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >>>> > ce/ itres.c [0]PETSC ERROR: #6 KSPSolve_GMRES() line 234 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/impls/g >>>> > mre s/gmres.c [0]PETSC ERROR: #7 KSPSolveTranspose() line 704 in >>>> > /gpfs/fs2/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/interfa >>>> > ce/ itfunc.c [0]PETSC ERROR: #8 main() line 324 in >>>> > /gpfs/fs1/aero/SOFTWARE/TOOLS/PROGRAMMING/petsc/src/ksp/ksp/example >>>> > s/t >>>> > utorials/ex10.c >>>> > >>>> > FYI, the transpose solve is a typical application for adjoint >>>> > optimization. There should be a big adjoint community of developers >>>> > that try to solve the transposed matrix. 
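For reference, here is a minimal sketch of the two approaches discussed above: calling KSPSolveTranspose() on the original operator, versus forming A^T once with MatTranspose() and solving it normally. It assumes a generic Mat A and Vec b, x that already exist, uses the PETSc 3.5 KSPSetOperators() calling sequence, and abbreviates options and error handling.

#include <petscksp.h>

/* Illustrative only: solve A^T x = b two ways. A, b, x are assumed to exist. */
PetscErrorCode SolveTransposeTwoWays(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  Mat            At;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Approach 1: let the KSP and PC apply the transpose. */
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolveTranspose(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);

  /* Approach 2: form A^T explicitly (the costly step discussed above) and solve it. */
  ierr = MatTranspose(A, MAT_INITIAL_MATRIX, &At);CHKERRQ(ierr);
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, At, At);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&At);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Assembling the transposed operator directly during matrix assembly, as Gaetan suggests earlier in the thread, avoids the MatTranspose() cost entirely.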
>>>> > >>>> > Any help is much appreciated, >>>> > Best, >>>> > Antoine >>>> > >>>> > >>>> > Antoine DeBlois >>>> > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead >>>> > A?ronautique / Aerospace 514-855-5001, x 50862 >>>> > antoine.deblois at aero.bombardier.com>>> > bar >>>> > dier.com> >>>> > >>>> > 2351 Blvd Alfred-Nobel >>>> > Montreal, Qc >>>> > H4S 1A9 >>>> > >>>> > [Description : Description : >>>> > http://signatures.ca.aero.bombardier.net/eom_logo_164x39_fr.jpg] >>>> > CONFIDENTIALITY NOTICE - This communication may contain privileged >>>> > or confidential information. >>>> > If you are not the intended recipient or received this >>>> > communication by error, please notify the sender and delete the >>>> > message without copying >>> >>> From bsmith at mcs.anl.gov Tue Sep 23 11:11:46 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 Sep 2014 11:11:46 -0500 Subject: [petsc-users] SNESSetJacobian In-Reply-To: <542188D1.40300@uni-mainz.de> References: <542188D1.40300@uni-mainz.de> Message-ID: On Sep 23, 2014, at 9:50 AM, anton wrote: > Starting from version 3.5 the matrix parameters in SNESSetJacobian are no longer pointers, hence my question: > What is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? > What about FormFunction? Could you please explain why you need to change the Mat? Our hope was that people would not need to change it. Note that you can change the type of a matrix at any time. So for example inside your FormJacobian you can have code like MatSetType(J,MATAIJ) this wipes out the old matrix data structure and gives you an empty matrix of the new type ready to be preallocated and then filled. Let us know what you need. Barry > > Thanks, > Anton From antoine.deblois at aero.bombardier.com Tue Sep 23 11:51:51 2014 From: antoine.deblois at aero.bombardier.com (Antoine De Blois) Date: Tue, 23 Sep 2014 16:51:51 +0000 Subject: [petsc-users] superlu_dist and MatSolveTranspose In-Reply-To: References: <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED55AED@MTLWAEXCH005.ca.aero.bombardier.net> <87vbpc49e3.fsf@jedbrown.org> <22B78B7D747CBF4FA36FCD2C6EC7AF5EAED5BEDF@MTLWAEXCH005.ca.aero.bombardier.net> <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BB9A@MTLWAEXCH004.ca.aero.bombardier.net> Message-ID: <22B78B7D747CBF4FA36FCD2C6EC7AF5EBA93BD3D@MTLWAEXCH004.ca.aero.bombardier.net> Morning Hong, Alright, fully understood. Please keep me posted on that matter. Regards, Antoine -----Message d'origine----- De?: Hong [mailto:hzhang at mcs.anl.gov] Envoy??: Tuesday, September 23, 2014 11:49 AM ??: Hong Cc?: Antoine De Blois; Gaetan Kenway; petsc-users at mcs.anl.gov; Sherry Li Objet?: Re: [petsc-users] superlu_dist and MatSolveTranspose Antoine, I just find out that superlu_dist does not support MatSolveTransport yet (see Sherry's email below). Once superlu_dist provides this support, we can add it to the petsc/superlu_dist interface. Thanks for your patience. Hong ------------------------------------------- Hong, Sorry, the transposed solve is not there yet; it's not as simple as serial version, because here, it requires to set up entirely different communication pattern. I will try to find time to do it. Sherry On Tue, Sep 23, 2014 at 8:11 AM, Hong wrote: > > Sherry, > Can superlu_dist be used for solving A^T x = b? > > Using the option > options.Trans = TRANS; > with the existing petsc-superlu_dist interface, I cannot get correct solution. > > Hong On Mon, Sep 22, 2014 at 12:47 PM, Hong wrote: > I'll add it. 
It would not take too long, just matter of priority. > I'll try to get it done in a day or two, then let you know when it works. > > Hong > > On Mon, Sep 22, 2014 at 12:11 PM, Antoine De Blois > wrote: >> Dear all, >> >> Sorry for the delay on this topic. >> >> Thank you Gaetan for your suggestion. I had thought about doing that originally, but I had left it out since I thought that a rank owned the entire row of the matrix (and not only the sub-diagonal part). I will certainly give it a try. >> >> I still need the MatSolveTranspose since I need the ability to reuse the residual jacobian matrix from the flow (a 1st order approximation of it), which is assembled in a non-transposed format. This way the adjoint system is solved in a pseudo-time step manner, where the product of the exact jacobian matrix and the adjoint vector is used as a source term in the rhs. >> >> Hong, do you have an estimation of the time required to implement it in superlu_dist? >> >> Best, >> Antoine >> >> -----Message d'origine----- >> De : Hong [mailto:hzhang at mcs.anl.gov] Envoy? : Friday, August 29, >> 2014 9:14 PM ? : Gaetan Kenway Cc : Antoine De Blois; >> petsc-users at mcs.anl.gov Objet : Re: [petsc-users] superlu_dist and >> MatSolveTranspose >> >> We can add MatSolveTranspose() to the petsc interface with superlu_dist. >> >> Jed, >> Are you working on it? If not, I can work on it. >> >> Hong >> >> On Fri, Aug 29, 2014 at 6:14 PM, Gaetan Kenway wrote: >>> Hi Antoine >>> >>> We are also using PETSc for solving adjoint systems resulting from >>> CFD. To get around the matSolveTranspose issue we just assemble the >>> transpose matrix directly and then call KSPSolve(). If this is >>> possible in your application I think it is probably the best >>> approach >>> >>> Gaetan >>> >>> >>> On Fri, Aug 29, 2014 at 3:58 PM, Antoine De Blois >>> wrote: >>>> >>>> Hello Jed, >>>> >>>> Thank you for your quick response. So I spent some time to dig >>>> deeper into my problem. I coded a shell script that passes through >>>> a bunch of ksp_type, pc_type and sub_pc_type. So please disregard >>>> the comment about the "does not converge properly for transpose". I >>>> had taken that conclusion from my own code (and not from the ex10 >>>> and extracted matrix), and a KSPSetFromOptions was missing. Apologies for that. >>>> >>>> What remains is the performance issue. The MatSolveTranspose takes >>>> a very long time to converge. For a matrix of 3 million rows, >>>> MatSolveTranspose takes roughly 5 minutes on 64 cpus, whereas the >>>> MatSolve is almost instantaneous!. When I gdb my code, petsc seems >>>> to be stalled in the MatLUFactorNumeric_SeqAIJ_Inode () for a long time. >>>> I also did a top on the compute node to check the RAM usage. It was >>>> hovering over 2 gig, so memory usage does not seem to be an issue here. 
>>>> > >>>> > Any help is much appreciated, >>>> > Best, >>>> > Antoine >>>> > >>>> > >>>> > Antoine DeBlois >>>> > Specialiste ingenierie, MDO lead / Engineering Specialist, MDO >>>> > lead A?ronautique / Aerospace 514-855-5001, x 50862 >>>> > antoine.deblois at aero.bombardier.com>>> > om >>>> > bar >>>> > dier.com> >>>> > >>>> > 2351 Blvd Alfred-Nobel >>>> > Montreal, Qc >>>> > H4S 1A9 >>>> > >>>> > [Description : Description : >>>> > http://signatures.ca.aero.bombardier.net/eom_logo_164x39_fr.jpg] >>>> > CONFIDENTIALITY NOTICE - This communication may contain >>>> > privileged or confidential information. >>>> > If you are not the intended recipient or received this >>>> > communication by error, please notify the sender and delete the >>>> > message without copying >>> >>> From rupp at iue.tuwien.ac.at Tue Sep 23 14:45:20 2014 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Tue, 23 Sep 2014 21:45:20 +0200 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <54218927.9040504@txcorp.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> <5420F8F6.400@iue.tuwien.ac.at> <54218927.9040504@txcorp.com> Message-ID: <5421CDD0.5000509@iue.tuwien.ac.at> Hi Dominic, > PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c uses > DMDA's which require a few additional fixes. I haven't opened a pull > request for these yet but I will do that before Thursday. > > Regarding the rebase, wouldn't it be preferable to just resolve the > conflicts in the merge commit? In any event, I've merged these branches > several times into local integration branches created off of recent > petsc/master branches so I'm pretty familiar with the conflicts and how > to resolve them. I can help with the merge or do a rebase, whichever you > prefer. Ok, I'll give the merge a try and see how things go. :-) Best regards, Karli From dmeiser at txcorp.com Tue Sep 23 14:57:05 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Tue, 23 Sep 2014 13:57:05 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <5421CDD0.5000509@iue.tuwien.ac.at> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> <5420F8F6.400@iue.tuwien.ac.at> <54218927.9040504@txcorp.com> <5421CDD0.5000509@iue.tuwien.ac.at> Message-ID: <5421D091.1040609@txcorp.com> On 09/23/2014 01:45 PM, Karl Rupp wrote: > Hi Dominic, > > > PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c > uses >> DMDA's which require a few additional fixes. I haven't opened a pull >> request for these yet but I will do that before Thursday. >> >> Regarding the rebase, wouldn't it be preferable to just resolve the >> conflicts in the merge commit? In any event, I've merged these branches >> several times into local integration branches created off of recent >> petsc/master branches so I'm pretty familiar with the conflicts and how >> to resolve them. I can help with the merge or do a rebase, whichever you >> prefer. > > Ok, I'll give the merge a try and see how things go. :-) > > Best regards, > Karli > > I can join a google+ or skype session to assist if that helps. Let me know if you run into problems. 
Cheers, Dominic -- Dominic Meiser Tech-X Corporation 5621 Arapahoe Avenue Boulder, CO 80303 USA Telephone: 303-996-2036 Fax: 303-448-7756 www.txcorp.com From salazardetroya at gmail.com Tue Sep 23 15:01:11 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Tue, 23 Sep 2014 15:01:11 -0500 Subject: [petsc-users] DMPlex with spring elements Message-ID: Hi all I was wondering if it could be possible to build a model similar to the example snes/ex12.c, but with spring elements (for elasticity) instead of simplicial elements. Spring elements in a grid, therefore each element would have two nodes and each node two components. There would be more differences, because instead of calling the functions f0,f1,g0,g1,g2 and g3 to build the residual and the jacobian, I would call a routine that would build the residual vector and the jacobian matrix directly. I would not have shape functions whatsoever. My problem is discrete, I don't have a PDE and my equations are algebraic. What is the best way in petsc to solve this problem? Is there any example that I can follow? Thanks in advance Miguel -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmeiser at txcorp.com Tue Sep 23 16:48:34 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Tue, 23 Sep 2014 15:48:34 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <5421CDD0.5000509@iue.tuwien.ac.at> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> <5420F8F6.400@iue.tuwien.ac.at> <54218927.9040504@txcorp.com> <5421CDD0.5000509@iue.tuwien.ac.at> Message-ID: <5421EAB2.9000203@txcorp.com> On 09/23/2014 01:45 PM, Karl Rupp wrote: > Hi Dominic, > > > PR #178 gets you most of the way. src/ksp/ksp/examples/tests/ex32.c > uses >> DMDA's which require a few additional fixes. I haven't opened a pull >> request for these yet but I will do that before Thursday. >> >> Regarding the rebase, wouldn't it be preferable to just resolve the >> conflicts in the merge commit? In any event, I've merged these branches >> several times into local integration branches created off of recent >> petsc/master branches so I'm pretty familiar with the conflicts and how >> to resolve them. I can help with the merge or do a rebase, whichever you >> prefer. > > Ok, I'll give the merge a try and see how things go. :-) > > Best regards, > Karli > > Hi Karli, I just updated the branch for PR #178 with the additional fixes for the DMDA issues. This branch now has all my GPU related bug fixes. Cheers, Dominic -- Dominic Meiser Tech-X Corporation 5621 Arapahoe Avenue Boulder, CO 80303 USA Telephone: 303-996-2036 Fax: 303-448-7756 www.txcorp.com From zinlin.zinlin at gmail.com Tue Sep 23 19:00:05 2014 From: zinlin.zinlin at gmail.com (Zin Lin) Date: Tue, 23 Sep 2014 20:00:05 -0400 Subject: [petsc-users] Memory requirements in SUPERLU_DIST Message-ID: Hi I am solving a frequency domain Maxwell problem for a dielectric structure of size 90x90x50, (the total matrix size is (90x90x50x6)^2 which includes the three vector components as well as real and imaginary parts.) I am using SUPERLU_DIST for the direct solver with the following options parsymbfact = 1, (parallel symbolic factorization) permcol = PARMETIS, (parallel METIS) permrow = NATURAL (natural ordering). 
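As an aside, these SuperLU_DIST settings are normally chosen through PETSc's options database. The sketch below uses option names from the PETSc 3.5-era SuperLU_DIST interface (they should be verified against the -help output of the installed version), the helper function name is made up for illustration, and the same effect is obtained by passing the options on the command line.

#include <petscsys.h>

/* Hypothetical helper: request SuperLU_DIST with ParMETIS column ordering,
   natural row ordering, and parallel symbolic factorization. */
PetscErrorCode RequestSuperLUDist(void)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscOptionsSetValue("-pc_type", "lu");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-pc_factor_mat_solver_package", "superlu_dist");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mat_superlu_dist_colperm", "PARMETIS");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mat_superlu_dist_rowperm", "NATURAL");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mat_superlu_dist_parsymbfact", "1");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}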
First, I tried to use 4096 cores with 2GB / core memory which totals to about 8 TB of memory. I get the following error: Using ParMETIS for parallel ordering. Structual symmetry is:100% Current memory used: 1400271832 bytes Maximum memory used: 1575752120 bytes ***Memory allocation failed for SetupCoarseGraph: adjncy. Requested size: 148242928 bytes So it seems to be an insufficient memory allocation problem (which apparently happens at the METIS analysis phase?). Then, I tried to use 64 large-memory cores which have a total of 2 TB memory (so larger memory per each core), it seems to work fine (though the solver takes about 900 sec ). What I don't understand is why memory per core matters rather than the total memory? If the work space is distributed across the processors, shouldn't it work as long as I choose a sufficient number of smaller-memory cores? What kind of role does the memory per core play in the algorithm in contrast to the total memory over all the cores? The issue is I would rather use a large number of small-memory cores than any number of the large-memory cores. The latter are two times more expensive in terms of service units (I am running on STAMPEDE at TACC) and not many cores are available either. Any idea would be appreciated. Zin -- Zin Lin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 23 20:59:25 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 23 Sep 2014 20:59:25 -0500 Subject: [petsc-users] Memory requirements in SUPERLU_DIST In-Reply-To: References: Message-ID: <4A932A2D-4F50-4843-973D-CC1712E91736@mcs.anl.gov> This is something you better ask Sherri about. She?s the one who wrote and understands SuperLU_DIST Barry On Sep 23, 2014, at 7:00 PM, Zin Lin wrote: > Hi > I am solving a frequency domain Maxwell problem for a dielectric structure of size 90x90x50, (the total matrix size is (90x90x50x6)^2 which includes the three vector components as well as real and imaginary parts.) > I am using SUPERLU_DIST for the direct solver with the following options > > parsymbfact = 1, (parallel symbolic factorization) > permcol = PARMETIS, (parallel METIS) > permrow = NATURAL (natural ordering). > > First, I tried to use 4096 cores with 2GB / core memory which totals to about 8 TB of memory. > I get the following error: > > Using ParMETIS for parallel ordering. > Structual symmetry is:100% > Current memory used: 1400271832 bytes > Maximum memory used: 1575752120 bytes > ***Memory allocation failed for SetupCoarseGraph: adjncy. Requested size: 148242928 bytes > > So it seems to be an insufficient memory allocation problem (which apparently happens at the METIS analysis phase?). > > Then, I tried to use 64 large-memory cores which have a total of 2 TB memory (so larger memory per each core), it seems to work fine (though the solver takes about 900 sec ). > What I don't understand is why memory per core matters rather than the total memory? If the work space is distributed across the processors, shouldn't it work as long as I choose a sufficient number of smaller-memory cores? What kind of role does the memory per core play in the algorithm in contrast to the total memory over all the cores? > > The issue is I would rather use a large number of small-memory cores than any number of the large-memory cores. The latter are two times more expensive in terms of service units (I am running on STAMPEDE at TACC) and not many cores are available either. > > Any idea would be appreciated. 
> > Zin > > -- > Zin Lin > From knepley at gmail.com Tue Sep 23 21:40:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Sep 2014 22:40:52 -0400 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Hi all > > I was wondering if it could be possible to build a model similar to the > example snes/ex12.c, but with spring elements (for elasticity) instead of > simplicial elements. Spring elements in a grid, therefore each element > would have two nodes and each node two components. There would be more > differences, because instead of calling the functions f0,f1,g0,g1,g2 and g3 > to build the residual and the jacobian, I would call a routine that would > build the residual vector and the jacobian matrix directly. I would not > have shape functions whatsoever. My problem is discrete, I don't have a PDE > and my equations are algebraic. What is the best way in petsc to solve this > problem? Is there any example that I can follow? Thanks in advance > Yes, ex12 is fairly specific to FEM. However, I think the right tools for what you want are DMPlex and PetscSection. Here is how I would proceed: 1) Make a DMPlex that encodes a simple network that you wish to simulate 2) Make a PetscSection that gets the data layout right. Its hard from the above for me to understand where you degrees of freedom actually are. This is usually the hard part. 3) Calculate the residual, so you can check an exact solution. Here you use the PetscSectionGetDof/Offset() for each mesh piece that you are interested in. Again, its hard to be more specific when I do not understand your discretization. Thanks, Matt > Miguel > > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Tue Sep 23 23:13:36 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Wed, 24 Sep 2014 04:13:36 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: You may also want to take a look at the DMNetwork framework that can be used for general unstructured networks that don't use PDEs. Its description is given in the manual and an example is in src/snes/examples/tutorials/network/pflow. Shri From: Matthew Knepley Date: Tue, 23 Sep 2014 22:40:52 -0400 To: Miguel Angel Salazar de Troya Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] DMPlex with spring elements >On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya > wrote: > >Hi all >I was wondering if it could be possible to build a model similar to the >example snes/ex12.c, but with spring elements (for elasticity) instead of >simplicial elements. Spring elements in a grid, therefore each element >would have two nodes and each node two components. There would be more >differences, because instead of calling the functions f0,f1,g0,g1,g2 and >g3 to build the residual and the jacobian, I would call a routine that >would build the residual vector and the jacobian matrix directly. I would >not have shape functions whatsoever. 
My problem is discrete, I don't have >a PDE and my equations are algebraic. What is the best way in petsc to >solve this problem? Is there any example that I can follow? Thanks in >advance > > > > >Yes, ex12 is fairly specific to FEM. However, I think the right tools for >what you want are >DMPlex and PetscSection. Here is how I would proceed: > > 1) Make a DMPlex that encodes a simple network that you wish to simulate > > 2) Make a PetscSection that gets the data layout right. Its hard from >the above > for me to understand where you degrees of freedom actually are. >This is usually > the hard part. > > 3) Calculate the residual, so you can check an exact solution. Here you >use the > PetscSectionGetDof/Offset() for each mesh piece that you are >interested in. Again, > its hard to be more specific when I do not understand your >discretization. > > Thanks, > > Matt > > >Miguel > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener From jed at jedbrown.org Tue Sep 23 19:14:15 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 23 Sep 2014 17:14:15 -0700 Subject: [petsc-users] Memory requirements in SUPERLU_DIST In-Reply-To: References: Message-ID: <87mw9pho2w.fsf@jedbrown.org> Zin Lin writes: > What I don't understand is why memory per core matters rather than the > total memory? If the work space is distributed across the processors, > shouldn't it work as long as I choose a sufficient number of smaller-memory > cores? METIS is a serial partitioner. ParMETIS is parallel, but often performs worse and still doesn't scale to very large numbers of cores, so it is not the default for most direct solver packages. > What kind of role does the memory per core play in the algorithm in > contrast to the total memory over all the cores? > > The issue is I would rather use a large number of small-memory cores than > any number of the large-memory cores. The latter are two times more > expensive in terms of service units (I am running on STAMPEDE at TACC) and > not many cores are available either. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From popov at uni-mainz.de Wed Sep 24 05:21:45 2014 From: popov at uni-mainz.de (anton) Date: Wed, 24 Sep 2014 12:21:45 +0200 Subject: [petsc-users] SNESSetJacobian In-Reply-To: References: <542188D1.40300@uni-mainz.de> Message-ID: <54229B39.8060900@uni-mainz.de> On 09/23/2014 06:11 PM, Barry Smith wrote: > On Sep 23, 2014, at 9:50 AM, anton wrote: > >> Starting from version 3.5 the matrix parameters in SNESSetJacobian are no longer pointers, hence my question: >> What is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? >> What about FormFunction? > Could you please explain why you need to change the Mat? Our hope was that people would not need to change it. Note that you can change the type of a matrix at any time. So for example inside your FormJacobian you can have code like MatSetType(J,MATAIJ) this wipes out the old matrix data structure and gives you an empty matrix of the new type ready to be preallocated and then filled. 
Let us know what you need. How should a user switch from assembled to a matrix-free Jacobian (for example) within one run? Simplest is resetting SNES altogether, I guess. Anton > Barry > >> Thanks, >> Anton From knepley at gmail.com Wed Sep 24 05:48:15 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 06:48:15 -0400 Subject: [petsc-users] SNESSetJacobian In-Reply-To: <54229B39.8060900@uni-mainz.de> References: <542188D1.40300@uni-mainz.de> <54229B39.8060900@uni-mainz.de> Message-ID: On Wed, Sep 24, 2014 at 6:21 AM, anton wrote: > > On 09/23/2014 06:11 PM, Barry Smith wrote: > >> On Sep 23, 2014, at 9:50 AM, anton wrote: >> >> Starting from version 3.5 the matrix parameters in SNESSetJacobian are >>> no longer pointers, hence my question: >>> What is the most appropriate place to call SNESSetJacobian if I need to >>> change the Jacobian during solution? >>> What about FormFunction? >>> >> Could you please explain why you need to change the Mat? Our hope was >> that people would not need to change it. Note that you can change the type >> of a matrix at any time. So for example inside your FormJacobian you can >> have code like MatSetType(J,MATAIJ) this wipes out the old matrix data >> structure and gives you an empty matrix of the new type ready to be >> preallocated and then filled. Let us know what you need. >> > > > How should a user switch from assembled to a matrix-free Jacobian (for > example) within one run? Simplest is resetting SNES altogether, I guess. > Set the type to MATSHELL and set your apply function. You still should not need to change the pointer, exactly as Barry says above. Matt > Anton > > Barry >> >> Thanks, >>> Anton >>> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 24 06:52:18 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Sep 2014 06:52:18 -0500 Subject: [petsc-users] SNESSetJacobian In-Reply-To: <54229B39.8060900@uni-mainz.de> References: <542188D1.40300@uni-mainz.de> <54229B39.8060900@uni-mainz.de> Message-ID: On Sep 24, 2014, at 5:21 AM, anton wrote: > > On 09/23/2014 06:11 PM, Barry Smith wrote: >> On Sep 23, 2014, at 9:50 AM, anton wrote: >> >>> Starting from version 3.5 the matrix parameters in SNESSetJacobian are no longer pointers, hence my question: >>> What is the most appropriate place to call SNESSetJacobian if I need to change the Jacobian during solution? >>> What about FormFunction? >> Could you please explain why you need to change the Mat? Our hope was that people would not need to change it. Note that you can change the type of a matrix at any time. So for example inside your FormJacobian you can have code like MatSetType(J,MATAIJ) this wipes out the old matrix data structure and gives you an empty matrix of the new type ready to be preallocated and then filled. Let us know what you need. > > > How should a user switch from assembled to a matrix-free Jacobian (for example) within one run? Simplest is resetting SNES altogether, I guess. So you want to run, say, three steps of Newton ?matrix-free? and then four steps with an explicit matrix? Then I guess you could call SNESSetJacobian() inside a SNES monitor routine, or even in your compute Jacobian routine. Calling it within FormFunction would not be a good idea since FormFunction is called in a variety of places. 
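A minimal sketch of what Matt and Barry describe, keeping the Mat handed to FormJacobian and changing only its type, is given below. The AppCtx with a use_matrix_free flag and the empty multiply routine are made-up placeholders, preallocation of the AIJ case is omitted, and if the same matrix is also the preconditioning matrix the preconditioner needs separate treatment (for example, keep an assembled P or use -pc_type none).

#include <petscsnes.h>

/* Hypothetical application context; 'use_matrix_free' is an assumption for illustration. */
typedef struct {
  PetscBool use_matrix_free;
} AppCtx;

/* Matrix-free action y = J(u) x; the body is a placeholder.  The context set
   below can be recovered here with MatShellGetContext(). */
static PetscErrorCode MyJacobianMult(Mat J, Vec x, Vec y)
{
  PetscFunctionBeginUser;
  /* ... apply the Jacobian action to x and store it in y ... */
  PetscFunctionReturn(0);
}

/* FormJacobian that switches the representation of J in place (PETSc 3.5 signature). */
PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat P, void *ptr)
{
  AppCtx        *ctx = (AppCtx*)ptr;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  if (ctx->use_matrix_free) {
    ierr = MatSetType(J, MATSHELL);CHKERRQ(ierr);               /* wipes the old data structure */
    ierr = MatShellSetContext(J, ctx);CHKERRQ(ierr);
    ierr = MatShellSetOperation(J, MATOP_MULT, (void (*)(void))MyJacobianMult);CHKERRQ(ierr);
  } else {
    ierr = MatSetType(J, MATAIJ);CHKERRQ(ierr);                 /* empty matrix of the new type */
    /* ... preallocate and fill the entries here ... */
    ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}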
Barry > > Anton > >> Barry >> >>> Thanks, >>> Anton > From salazardetroya at gmail.com Wed Sep 24 11:31:51 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Wed, 24 Sep 2014 11:31:51 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Thanks for your response. My discretization is based on spring elements. For the linear one dimensional case in which each spring has a coefficient k, their jacobian would be this two by two matrix. [ k -k ] [ -k k ] and the internal force [ k ( Ui - Uj) ] [ k ( Uj - Ui) ] where Ui and Uj are the node displacements (just one displacement per node because it's one dimensional) For the two dimensional case, assuming small deformations, we have a four-by-four matrix. Each node has two degrees of freedom. We obtain it by performing the outer product of the vector (t , -t) where "t" is the vector that connects both nodes in a spring. This is for the case of small deformations. I would need to assemble each spring contribution to the jacobian and the residual like they were finite elements. The springs share nodes, that's how they are connected. This example is just the linear case, I will have to implement a nonlinear case in a similar fashion. Seeing the DMNetwork example, I think it's what I need, although I don't know much of power electric grids and it's hard for me to understand what's going on. Do you have a good reference to be able to follow the code? For example, why are they adding components to the edges? 475: DMNetworkAddComponent (networkdm,i,componentkey[0],&pfdata.branch[i-eStart]); Miguel On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. < abhyshr at mcs.anl.gov> wrote: > You may also want to take a look at the DMNetwork framework that can be > used for general unstructured networks that don't use PDEs. Its > description is given in the manual and an example is in > src/snes/examples/tutorials/network/pflow. > > Shri > > From: Matthew Knepley > Date: Tue, 23 Sep 2014 22:40:52 -0400 > To: Miguel Angel Salazar de Troya > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > > >On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya > > wrote: > > > >Hi all > >I was wondering if it could be possible to build a model similar to the > >example snes/ex12.c, but with spring elements (for elasticity) instead of > >simplicial elements. Spring elements in a grid, therefore each element > >would have two nodes and each node two components. There would be more > >differences, because instead of calling the functions f0,f1,g0,g1,g2 and > >g3 to build the residual and the jacobian, I would call a routine that > >would build the residual vector and the jacobian matrix directly. I would > >not have shape functions whatsoever. My problem is discrete, I don't have > >a PDE and my equations are algebraic. What is the best way in petsc to > >solve this problem? Is there any example that I can follow? Thanks in > >advance > > > > > > > > > >Yes, ex12 is fairly specific to FEM. However, I think the right tools for > >what you want are > >DMPlex and PetscSection. Here is how I would proceed: > > > > 1) Make a DMPlex that encodes a simple network that you wish to simulate > > > > 2) Make a PetscSection that gets the data layout right. Its hard from > >the above > > for me to understand where you degrees of freedom actually are. > >This is usually > > the hard part. > > > > 3) Calculate the residual, so you can check an exact solution. 
Here you > >use the > > PetscSectionGetDof/Offset() for each mesh piece that you are > >interested in. Again, > > its hard to be more specific when I do not understand your > >discretization. > > > > Thanks, > > > > Matt > > > > > >Miguel > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > >-- > >What most experimenters take for granted before they begin their > >experiments is infinitely more interesting than any results to which > >their experiments lead. > >-- Norbert Wiener > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 12:01:39 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 12:01:39 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 11:31 AM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Thanks for your response. My discretization is based on spring elements. > For the linear one dimensional case in which each spring has a coefficient > k, their jacobian would be this two by two matrix. > > [ k -k ] > [ -k k ] > > and the internal force > > [ k ( Ui - Uj) ] > [ k ( Uj - Ui) ] > > where Ui and Uj are the node displacements (just one displacement per node > because it's one dimensional) > > For the two dimensional case, assuming small deformations, we have a > four-by-four matrix. Each node has two degrees of freedom. We obtain it by > performing the outer product of the vector (t , -t) where "t" is the vector > that connects both nodes in a spring. This is for the case of small > deformations. I would need to assemble each spring contribution to the > jacobian and the residual like they were finite elements. The springs share > nodes, that's how they are connected. This example is just the linear case, > I will have to implement a nonlinear case in a similar fashion. > > Seeing the DMNetwork example, I think it's what I need, although I don't > know much of power electric grids and it's hard for me to understand what's > going on. Do you have a good reference to be able to follow the code? For > example, why are they adding components to the edges? > Okay, I understand what you want now. I would recommend doing a simple example by hand first. The PetscSection is very simple, just 'dim' degrees of freedom for each vertex. 
It should look something like DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd); PetscSectionCreate(comm, &s); PetscSectionSetNumFields(s, 1); PetscSectionSetChart(s, vStart, vEnd); for (v = vStart; v < vEnd; ++v) { PetscSectionSetDof(s, v, dim); } PetscSectionSetUp(s); DMSetDefaultSection(dm, s); PetscSectionDestroy(&s); Then when you form the residual, you loop over edges and add a contribution to each vertex (I think, its still not clear to me how your residual is defined) VecGetArray(locF, &residual); DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd); for (c = cStart; c < cEnd; ++c) { const PetscInt *cone; PetscInt offA, offB; DMPlexGetCone(c, &cone); PetscSectionGetOffset(s, cone[0], &offA); PetscSectionGetOffset(s, cone[1], &offB); residual[offA] = ; residual[offB] = ; } VecRestoreArray(locF, &residual); After that works, using DMNetwork should be much easier. Thanks, Matt > 475: DMNetworkAddComponent (networkdm,i,componentkey[0],&pfdata.branch[i-eStart]); > > Miguel > > > On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov> wrote: > >> You may also want to take a look at the DMNetwork framework that can be >> used for general unstructured networks that don't use PDEs. Its >> description is given in the manual and an example is in >> src/snes/examples/tutorials/network/pflow. >> >> Shri >> >> From: Matthew Knepley >> Date: Tue, 23 Sep 2014 22:40:52 -0400 >> To: Miguel Angel Salazar de Troya >> Cc: "petsc-users at mcs.anl.gov" >> Subject: Re: [petsc-users] DMPlex with spring elements >> >> >> >On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >> > wrote: >> > >> >Hi all >> >I was wondering if it could be possible to build a model similar to the >> >example snes/ex12.c, but with spring elements (for elasticity) instead of >> >simplicial elements. Spring elements in a grid, therefore each element >> >would have two nodes and each node two components. There would be more >> >differences, because instead of calling the functions f0,f1,g0,g1,g2 and >> >g3 to build the residual and the jacobian, I would call a routine that >> >would build the residual vector and the jacobian matrix directly. I would >> >not have shape functions whatsoever. My problem is discrete, I don't have >> >a PDE and my equations are algebraic. What is the best way in petsc to >> >solve this problem? Is there any example that I can follow? Thanks in >> >advance >> > >> > >> > >> > >> >Yes, ex12 is fairly specific to FEM. However, I think the right tools for >> >what you want are >> >DMPlex and PetscSection. Here is how I would proceed: >> > >> > 1) Make a DMPlex that encodes a simple network that you wish to >> simulate >> > >> > 2) Make a PetscSection that gets the data layout right. Its hard from >> >the above >> > for me to understand where you degrees of freedom actually are. >> >This is usually >> > the hard part. >> > >> > 3) Calculate the residual, so you can check an exact solution. Here you >> >use the >> > PetscSectionGetDof/Offset() for each mesh piece that you are >> >interested in. Again, >> > its hard to be more specific when I do not understand your >> >discretization. 
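Tying the residual skeleton earlier in this message to the 1-D internal force k(Ui - Uj) given in the previous message, here is a hedged completion. It assumes one degree of freedom per vertex, a single spring constant, and local vectors with the residual zeroed beforehand; note also that DMPlexGetCone() takes the DM as its first argument.

#include <petscdmplex.h>

/* Sketch only: 1-D spring residual accumulated edge by edge. */
PetscErrorCode FormSpringResidual(DM dm, PetscSection s, Vec locU, Vec locF, PetscReal k)
{
  const PetscScalar *u;
  PetscScalar       *f;
  PetscInt           cStart, cEnd, c;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(locU, &u);CHKERRQ(ierr);
  ierr = VecGetArray(locF, &f);CHKERRQ(ierr);
  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr);   /* the springs (edges) */
  for (c = cStart; c < cEnd; ++c) {
    const PetscInt *cone;
    PetscInt        offA, offB;

    ierr = DMPlexGetCone(dm, c, &cone);CHKERRQ(ierr);                   /* the two end vertices */
    ierr = PetscSectionGetOffset(s, cone[0], &offA);CHKERRQ(ierr);
    ierr = PetscSectionGetOffset(s, cone[1], &offB);CHKERRQ(ierr);
    f[offA] += k*(u[offA] - u[offB]);                                   /* k (Ui - Uj) */
    f[offB] += k*(u[offB] - u[offA]);                                   /* k (Uj - Ui) */
  }
  ierr = VecRestoreArrayRead(locU, &u);CHKERRQ(ierr);
  ierr = VecRestoreArray(locF, &f);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}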
>> > >> > Thanks, >> > >> > Matt >> > >> > >> >Miguel >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >What most experimenters take for granted before they begin their >> >experiments is infinitely more interesting than any results to which >> >their experiments lead. >> >-- Norbert Wiener >> >> > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Wed Sep 24 13:43:51 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Wed, 24 Sep 2014 18:43:51 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: >Thanks for your response. My discretization is based on spring elements. >For the linear one dimensional case in which each spring has a >coefficient k, their jacobian would be this two by two matrix. >[ k -k ] >[ -k k ] > >and the internal force > >[ k ( Ui - Uj) ] >[ k ( Uj - Ui) ] > >where Ui and Uj are the node displacements (just one displacement per >node because it's one dimensional) > >For the two dimensional case, assuming small deformations, we have a >four-by-four matrix. Each node has two degrees of freedom. We obtain it >by performing the outer product of the vector (t , -t) where "t" is the >vector that connects both nodes in a spring. This is for the case of >small deformations. I would need to assemble each spring contribution to >the jacobian and the residual like they were finite elements. The springs >share nodes, that's how they are connected. This example is just the >linear case, I will have to implement a nonlinear case in a similar >fashion. > >Seeing the DMNetwork example, I think it's what I need, although I don't >know much of power electric grids and it's hard for me to understand >what's going on. Do you have a good reference to be able to follow the >code? > Please see the attached document which has more description of DMNetwork and the equations for the power grid example. I don't have anything that describes how the power grid example is implemented. >For example, why are they adding components to the edges? > >475: DMNetworkAddComponent >MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey[ >0],&pfdata.branch[i-eStart]);Miguel Each edge or node can have several components (limited to 10) attached to it. The term components, taken from the circuit terminology, refers to the elements of a network. For example, a component could be a resistor, inductor, spring, or even edge/vertex weights (for graph problems). For code implementation, component is a data structure that holds the data needed for the residual, Jacobian, or any other function evaluation. In the case of power grid, there are 4 components: branches or transmission lines connecting nodes, buses or nodes, generators that are incident at a subset of the nodes, and loads that are also incident at a subset of the nodes. 
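To make the component idea concrete, here is a sketch of how a spring could be registered and attached to every edge, patterned on what pflow does for branches. SpringData and AttachSprings are made-up names, and the surrounding DMNetwork setup and distribution calls are assumed to follow the pflow example.

#include <petscdmnetwork.h>

/* Hypothetical per-spring data carried by each edge of the network. */
typedef struct {
  PetscScalar k;    /* spring stiffness */
} SpringData;

/* Register a "spring" component and attach one instance to every local edge. */
PetscErrorCode AttachSprings(DM networkdm, SpringData *springs)
{
  PetscInt       key, eStart, eEnd, e;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMNetworkRegisterComponent(networkdm, "spring", sizeof(SpringData), &key);CHKERRQ(ierr);
  ierr = DMNetworkGetEdgeRange(networkdm, &eStart, &eEnd);CHKERRQ(ierr);
  for (e = eStart; e < eEnd; ++e) {
    ierr = DMNetworkAddComponent(networkdm, e, key, &springs[e-eStart]);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}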
Each of the these components are defined by their data structures given in pf.h. DMNetwork is a wrapper class of DMPlex specifically for network applications that can be solely described using nodes, edges, and their associated components. If you have a PDE, or need FEM, or need other advanced features then DMPlex would be suitable. Please send us a write-up of your equations so that we can assist you better. Shri > > >On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. > wrote: > >You may also want to take a look at the DMNetwork framework that can be >used for general unstructured networks that don't use PDEs. Its >description is given in the manual and an example is in >src/snes/examples/tutorials/network/pflow. > >Shri > >From: Matthew Knepley >Date: Tue, 23 Sep 2014 22:40:52 -0400 >To: Miguel Angel Salazar de Troya >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] DMPlex with spring elements > > >>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >> wrote: >> >>Hi all >>I was wondering if it could be possible to build a model similar to the >>example snes/ex12.c, but with spring elements (for elasticity) instead of >>simplicial elements. Spring elements in a grid, therefore each element >>would have two nodes and each node two components. There would be more >>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >>g3 to build the residual and the jacobian, I would call a routine that >>would build the residual vector and the jacobian matrix directly. I would >>not have shape functions whatsoever. My problem is discrete, I don't have >>a PDE and my equations are algebraic. What is the best way in petsc to >>solve this problem? Is there any example that I can follow? Thanks in >>advance >> >> >> >> >>Yes, ex12 is fairly specific to FEM. However, I think the right tools for >>what you want are >>DMPlex and PetscSection. Here is how I would proceed: >> >> 1) Make a DMPlex that encodes a simple network that you wish to >>simulate >> >> 2) Make a PetscSection that gets the data layout right. Its hard from >>the above >> for me to understand where you degrees of freedom actually are. >>This is usually >> the hard part. >> >> 3) Calculate the residual, so you can check an exact solution. Here you >>use the >> PetscSectionGetDof/Offset() for each mesh piece that you are >>interested in. Again, >> its hard to be more specific when I do not understand your >>discretization. >> >> Thanks, >> >> Matt >> >> >>Miguel >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign > > >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu -------------- next part -------------- A non-text attachment was scrubbed... Name: dmnetwork-abstract.pdf Type: application/pdf Size: 195830 bytes Desc: dmnetwork-abstract.pdf URL: From brianyang1106 at gmail.com Wed Sep 24 14:08:22 2014 From: brianyang1106 at gmail.com (Brian Yang) Date: Wed, 24 Sep 2014 14:08:22 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? 
Message-ID: Hi all, For example, I am using LSQR or CG to solve Ax=b. When we turn on the monitor option of solving the linear system, we could see iteration number and residual for each iteration. I believe it's L2 norm of the objective function? If yes, is there a way that I could set it to L1 norm for solving the system? Thanks. -- Brian Yang U of Houston -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 14:15:42 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 14:15:42 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 2:08 PM, Brian Yang wrote: > Hi all, > > For example, I am using LSQR or CG to solve Ax=b. > > When we turn on the monitor option of solving the linear system, we could > see iteration number and residual for each iteration. I believe it's L2 > norm of the objective function? > > If yes, is there a way that I could set it to L1 norm for solving the > system? > Yes, you can use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetConvergenceTest.html and then use VecNorm() with NORM_1 instead. Thanks, Matt > Thanks. > > -- > Brian Yang > U of Houston > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 24 14:17:35 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Sep 2014 14:17:35 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: Message-ID: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> Brian, Do you wish to monitor the convergence of the L1 norm, or do you wish to use the L1 norm in the convergence test? Regardless you cannot change the LSQR or CG to be a different algorithm (with different convergence properties) with the L1 norm. Barry On Sep 24, 2014, at 2:08 PM, Brian Yang wrote: > Hi all, > > For example, I am using LSQR or CG to solve Ax=b. > > When we turn on the monitor option of solving the linear system, we could see iteration number and residual for each iteration. I believe it's L2 norm of the objective function? > > If yes, is there a way that I could set it to L1 norm for solving the system? > > Thanks. > > -- > Brian Yang > U of Houston > > > From brianyang1106 at gmail.com Wed Sep 24 14:24:30 2014 From: brianyang1106 at gmail.com (Brian Yang) Date: Wed, 24 Sep 2014 14:24:30 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> Message-ID: Thanks Mat and Barry, Yes I want to use the L1 norm in the convergence test. For the ||Ax-b||, I always want to calculate L1 norm instead of L2. Is Mat's way gonna work on this? On Wed, Sep 24, 2014 at 2:17 PM, Barry Smith wrote: > > Brian, > > Do you wish to monitor the convergence of the L1 norm, or do you wish > to use the L1 norm in the convergence test? Regardless you cannot change > the LSQR or CG to be a different algorithm (with different convergence > properties) with the L1 norm. > > > Barry > > On Sep 24, 2014, at 2:08 PM, Brian Yang wrote: > > > Hi all, > > > > For example, I am using LSQR or CG to solve Ax=b. 
> > > > When we turn on the monitor option of solving the linear system, we > could see iteration number and residual for each iteration. I believe it's > L2 norm of the objective function? > > > > If yes, is there a way that I could set it to L1 norm for solving the > system? > > > > Thanks. > > > > -- > > Brian Yang > > U of Houston > > > > > > > > -- Brian Yang U of Houston -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Sep 24 14:31:38 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Sep 2014 14:31:38 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> Message-ID: <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> Brian, Your convergence test routine could call KSPBuildResidual() and then compute the 1-norm of the residual and make any decision it likes based on that norm. See KSPSetConvergenceTest() Barry On Sep 24, 2014, at 2:24 PM, Brian Yang wrote: > Thanks Mat and Barry, > > Yes I want to use the L1 norm in the convergence test. For the ||Ax-b||, I always want to calculate L1 norm instead of L2. Is Mat's way gonna work on this? > > On Wed, Sep 24, 2014 at 2:17 PM, Barry Smith wrote: > > Brian, > > Do you wish to monitor the convergence of the L1 norm, or do you wish to use the L1 norm in the convergence test? Regardless you cannot change the LSQR or CG to be a different algorithm (with different convergence properties) with the L1 norm. > > > Barry > > On Sep 24, 2014, at 2:08 PM, Brian Yang wrote: > > > Hi all, > > > > For example, I am using LSQR or CG to solve Ax=b. > > > > When we turn on the monitor option of solving the linear system, we could see iteration number and residual for each iteration. I believe it's L2 norm of the objective function? > > > > If yes, is there a way that I could set it to L1 norm for solving the system? > > > > Thanks. > > > > -- > > Brian Yang > > U of Houston > > > > > > > > > > > -- > Brian Yang > U of Houston > > > From knepley at gmail.com Wed Sep 24 14:35:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 14:35:01 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> Message-ID: On Wed, Sep 24, 2014 at 2:31 PM, Barry Smith wrote: > > Brian, > > Your convergence test routine could call KSPBuildResidual() and then > compute the 1-norm of the residual and make any decision it likes based on > that norm. See KSPSetConvergenceTest() Note that what Barry is saying is that the convergence theory for CG guarantees monotonicity in the energy (A) norm, but says nothing about L1 so you might get crap. Matt > > Barry > > On Sep 24, 2014, at 2:24 PM, Brian Yang wrote: > > > Thanks Mat and Barry, > > > > Yes I want to use the L1 norm in the convergence test. For the ||Ax-b||, > I always want to calculate L1 norm instead of L2. Is Mat's way gonna work > on this? > > > > On Wed, Sep 24, 2014 at 2:17 PM, Barry Smith wrote: > > > > Brian, > > > > Do you wish to monitor the convergence of the L1 norm, or do you wish > to use the L1 norm in the convergence test? Regardless you cannot change > the LSQR or CG to be a different algorithm (with different convergence > properties) with the L1 norm. 
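A minimal sketch of the convergence-test route suggested above, combining KSPSetConvergenceTest(), KSPBuildResidual() and VecNorm() with NORM_1. The function name, the hard-coded tolerance and the ignored context are placeholders for illustration rather than anything from a PETSc example, and, as noted in the thread, LSQR/CG still minimize in their own norms, so this only changes the stopping criterion:

  #include <petscksp.h>

  /* Placeholder test: stop once the 1-norm of the residual b - Ax is small. */
  static PetscErrorCode MyL1ConvergenceTest(KSP ksp, PetscInt it, PetscReal rnorm2,
                                            KSPConvergedReason *reason, void *ctx)
  {
    Vec            r;
    PetscReal      rnorm1;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPBuildResidual(ksp, NULL, NULL, &r);CHKERRQ(ierr);   /* assembles b - Ax */
    ierr = VecNorm(r, NORM_1, &rnorm1);CHKERRQ(ierr);
    ierr = VecDestroy(&r);CHKERRQ(ierr);
    *reason = (rnorm1 < 1.e-8) ? KSP_CONVERGED_ATOL : KSP_CONVERGED_ITERATING;
    PetscFunctionReturn(0);
  }

  /* registered once, after KSPCreate()/KSPSetOperators():
     ierr = KSPSetConvergenceTest(ksp, MyL1ConvergenceTest, NULL, NULL);CHKERRQ(ierr); */

Depending on the method, KSPBuildResidual() may cost an extra matrix-vector product per check, so calling it only every few iterations can be worthwhile on large problems.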
> > > > > > Barry > > > > On Sep 24, 2014, at 2:08 PM, Brian Yang wrote: > > > > > Hi all, > > > > > > For example, I am using LSQR or CG to solve Ax=b. > > > > > > When we turn on the monitor option of solving the linear system, we > could see iteration number and residual for each iteration. I believe it's > L2 norm of the objective function? > > > > > > If yes, is there a way that I could set it to L1 norm for solving the > system? > > > > > > Thanks. > > > > > > -- > > > Brian Yang > > > U of Houston > > > > > > > > > > > > > > > > > > > -- > > Brian Yang > > U of Houston > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Sep 24 14:54:08 2014 From: jed at jedbrown.org (Jed Brown) Date: Wed, 24 Sep 2014 12:54:08 -0700 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> Message-ID: <871tr0hk0v.fsf@jedbrown.org> Matthew Knepley writes: > On Wed, Sep 24, 2014 at 2:31 PM, Barry Smith wrote: > >> >> Brian, >> >> Your convergence test routine could call KSPBuildResidual() and then >> compute the 1-norm of the residual and make any decision it likes based on >> that norm. See KSPSetConvergenceTest() > > > Note that what Barry is saying is that the convergence theory for CG > guarantees monotonicity in the energy (A) norm, but says nothing > about L1 so you might get crap. Moreover, if you have an overdetermined linear system, but want to minimize the 1-norm of the residual (instead of the 2-norm that least squares minimizes), you are actually asking to solve a "linear program" and need to use an LP solver (a very different algorithm). If you merely watch the 1-norm of the residual, it might increase as the least squares solver converges. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From john.m.alletto at lmco.com Wed Sep 24 15:03:14 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Wed, 24 Sep 2014 20:03:14 +0000 Subject: [petsc-users] Setting coordinates Message-ID: All, I am trying to inset coordinate data into my program line 160 below comes back with a NULL pointer for coordinates. I did not expect this - what do I do, what does it mean? John 140 DM coordDA; 141 Vec coordinates; 142: DMDACoor3d ***coords; 143: PetscScalar u, ux, uy, uxx, uyy; 144: PetscReal D, K, hx, hy, hxdhy, hydhx; 145: PetscInt i,j, k; 150: D = user->D; 151: K = user->K; 152: hx = 1.0/(PetscReal)(info->mx-1); 153: hy = 1.0/(PetscReal)(info->my-1); 154: hxdhy = hx/hy; 155: hydhx = hy/hx; 156: /* 157: Compute function over the locally owned part of the grid 158: */ 159: DMGetCoordinateDA(info->da, &coordDA); 160: DMGetCoordinatesLocal(info->da, &coordinates); 161: DMDAVecGetArray(coordDA, coordinates, &coords); From knepley at gmail.com Wed Sep 24 15:09:22 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 15:09:22 -0500 Subject: [petsc-users] Setting coordinates In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 3:03 PM, Alletto, John M wrotAll, > > > I am trying to inset coordinate data into my program line 160 below comes > back with a NULL pointer for coordinates. 
> I did not expect this - what do I do, what does it mean? > Coordinates are not created by default since they take up a lot of memory. You either create the storage manually, or use something like this: http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMDASetUniformCoordinates.html Thanks, Matt > > John > > > 140 DM coordDA; > 141 Vec coordinates; > 142: DMDACoor3d ***coords; > 143: PetscScalar u, ux, uy, uxx, uyy; > 144: PetscReal D, K, hx, hy, hxdhy, hydhx; > 145: PetscInt i,j, k; > > > 150: D = user->D; > 151: K = user->K; > 152: hx = 1.0/(PetscReal)(info->mx-1); > 153: hy = 1.0/(PetscReal)(info->my-1); > 154: hxdhy = hx/hy; > 155: hydhx = hy/hx; > 156: /* > 157: Compute function over the locally owned part of the grid > 158: */ > 159: DMGetCoordinateDA(info->da, &coordDA); > 160: DMGetCoordinatesLocal(info->da, &coordinates); > 161: DMDAVecGetArray(coordDA, coordinates, &coords); > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianyang1106 at gmail.com Wed Sep 24 15:48:54 2014 From: brianyang1106 at gmail.com (Brian Yang) Date: Wed, 24 Sep 2014 15:48:54 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: <871tr0hk0v.fsf@jedbrown.org> References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> <871tr0hk0v.fsf@jedbrown.org> Message-ID: Thanks everyone, I will try KSPSetConvergenceTest to test. On Wed, Sep 24, 2014 at 2:54 PM, Jed Brown wrote: > Matthew Knepley writes: > > > On Wed, Sep 24, 2014 at 2:31 PM, Barry Smith wrote: > > > >> > >> Brian, > >> > >> Your convergence test routine could call KSPBuildResidual() and then > >> compute the 1-norm of the residual and make any decision it likes based > on > >> that norm. See KSPSetConvergenceTest() > > > > > > Note that what Barry is saying is that the convergence theory for CG > > guarantees monotonicity in the energy (A) norm, but says nothing > > about L1 so you might get crap. > > Moreover, if you have an overdetermined linear system, but want to > minimize the 1-norm of the residual (instead of the 2-norm that least > squares minimizes), you are actually asking to solve a "linear program" > and need to use an LP solver (a very different algorithm). > > If you merely watch the 1-norm of the residual, it might increase as the > least squares solver converges. > -- Brian Yang U of Houston -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.m.alletto at lmco.com Wed Sep 24 17:03:34 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Wed, 24 Sep 2014 22:03:34 +0000 Subject: [petsc-users] I expected the Laplace run with a smaller delta to have a more accurate solution, yet they come out almost exactly the same. Message-ID: I have set up a test for running a Laplacian solver with 2 sets of data. One has twice as many points as the other. Both cover the same range, I input the X, Y and Z variables set using SetCoordinates. I compare the results with an analytical model. I expected the Laplace run with a smaller delta to have a more accurate solution, yet they come out almost exactly the same. Any ideas? -------------- next part -------------- An HTML attachment was scrubbed... 
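For the DMDA coordinate question above, a minimal sketch of the route Matt points to, reusing the declarations from the quoted fragment (coordDA, coordinates, coords) and assuming a 3d DMDA called da; the [0,1] extents are placeholders, and DMGetCoordinateDM() is used here for the coordinate DM:

  /* Creating the coordinates is what makes DMGetCoordinatesLocal() return
     a vector instead of NULL. */
  ierr = DMDASetUniformCoordinates(da, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0);CHKERRQ(ierr);

  ierr = DMGetCoordinateDM(da, &coordDA);CHKERRQ(ierr);
  ierr = DMGetCoordinatesLocal(da, &coordinates);CHKERRQ(ierr);
  ierr = DMDAVecGetArray(coordDA, coordinates, &coords);CHKERRQ(ierr);
  /* coords[k][j][i].x, .y and .z now hold the vertex coordinates */
  ierr = DMDAVecRestoreArray(coordDA, coordinates, &coords);CHKERRQ(ierr);

For a non-uniform mesh, the manual alternative Matt mentions is to create a vector on the coordinate DM, fill it, and attach it with DMSetCoordinates().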
URL: From potaman at outlook.com Wed Sep 24 17:08:25 2014 From: potaman at outlook.com (subramanya sadasiva) Date: Wed, 24 Sep 2014 18:08:25 -0400 Subject: [petsc-users] Generating xdmf from h5 file. Message-ID: Hi, i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. The h5 file was generated by snes/ex12 which was run as, ex12 -dm_view hdf5:my.h5 When I do, petsc_gen_xdmf.py my.h5 I get the following error, File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 220, in generateXdmf(sys.argv[1]) File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 208, in generateXdmf time = np.array(h5['time']).flatten() File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in __getitem__ oid = h5o.open(self.id, self._e(name), lapl=self._lapl) File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) KeyError: "unable to open object (Symbol table: Can't open object)" I am not sure if the error is on my end. This is on Ubuntu 14.04 with the serial version of hdf5. I built petsc with --download-hdf5, is it necessary to use the same version of hdf5 to generate the xdmf file? Thanks Subramanya -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 17:16:37 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 17:16:37 -0500 Subject: [petsc-users] I expected the Laplace run with a smaller delta to have a more accurate solution, yet they come out almost exactly the same. In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 5:03 PM, Alletto, John M wrote: > I have set up a test for running a Laplacian solver with 2 sets of data. > > One has twice as many points as the other. > > Both cover the same range, I input the X, Y and Z variables set using > SetCoordinates. > > > > I compare the results with an analytical model. > > > > I expected the Laplace run with a smaller delta to have a more accurate > solution, yet they come out almost exactly the same. > > > > Any ideas? > > I think you may have the wrong idea of accuracy. If you are expecting to converge in the L_2 norm, you must do an integral to get the error, rather than just take the difference of vertex values (that is the l2 norm). You could be seeing superconvergence at the vertices, but I do not know what discretization you are using. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 17:19:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 17:19:51 -0500 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva wrote: > Hi, > i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. 
> The h5 file was generated by snes/ex12 which was run as, > > ex12 -dm_view hdf5:my.h5 > > When I do, > petsc_gen_xdmf.py my.h5 > > I get the following error, > > File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", > line 220, in > generateXdmf(sys.argv[1]) > File > "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 208, in generateXdmf > time = np.array(h5['time']).flatten() > File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in > __getitem__ > oid = h5o.open(self.id, self._e(name), lapl=self._lapl) > File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) > KeyError: "unable to open object (Symbol table: Can't open object)" > > I am not sure if the error is on my end. This is on Ubuntu 14.04 with the > serial version of hdf5. I built petsc with --download-hdf5, is it necessary > to use the same version of hdf5 to generate the xdmf file? > That code is alpha, and mainly built for me to experiment with an application here, so it is not user-friendly. In your HDF5 file, there is no 'time' since you are not running a TS. This access to h5['time'] should just be protected, and an empty array should be put in if its not there. Matt > Thanks > Subramanya > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From potaman at outlook.com Wed Sep 24 17:32:54 2014 From: potaman at outlook.com (subramanya sadasiva) Date: Wed, 24 Sep 2014 18:32:54 -0400 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: , Message-ID: Hi Matt, That did not help. Is there any other way to output the mesh to something that paraview can view? I tried outputting the file to a vtk file using ex12 -dm_view vtk:my.vtk:ascii_vtk which, I saw in another post on the forums, but that did not give me any output. (Sorry for sending this twice. but I noticed my reply did not go into the users forum) Subramanya Date: Wed, 24 Sep 2014 17:19:51 -0500 Subject: Re: [petsc-users] Generating xdmf from h5 file. From: knepley at gmail.com To: potaman at outlook.com CC: petsc-users at mcs.anl.gov On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva wrote: Hi, i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. The h5 file was generated by snes/ex12 which was run as, ex12 -dm_view hdf5:my.h5 When I do, petsc_gen_xdmf.py my.h5 I get the following error, File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 220, in generateXdmf(sys.argv[1]) File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 208, in generateXdmf time = np.array(h5['time']).flatten() File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in __getitem__ oid = h5o.open(self.id, self._e(name), lapl=self._lapl) File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) KeyError: "unable to open object (Symbol table: Can't open object)" I am not sure if the error is on my end. This is on Ubuntu 14.04 with the serial version of hdf5. I built petsc with --download-hdf5, is it necessary to use the same version of hdf5 to generate the xdmf file? That code is alpha, and mainly built for me to experiment with an application here, so it is not user-friendly. In yourHDF5 file, there is no 'time' since you are not running a TS. 
This access to h5['time'] should just be protected, andan empty array should be put in if its not there. Matt Thanks Subramanya -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 17:36:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 17:36:52 -0500 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 5:29 PM, subramanya sadasiva wrote: > Hi Matt, > That did not help. > That's not enough description to fix anything, and fixing it will require programming. > Is there any other way to output the mesh to something that paraview can > view? I tried outputting the file to a vtk file using > ex12 -dm_view vtk:my.vtk:ascii_vtk > > which, I saw in another post on the forums, but that did not give me any > output. > This is mixing two different things. PETSc has a diagnostic ASCII vtk output, so the type would be ascii, not vtk, and format ascii_vtk . It also has a production VTU output, which is type vtk with format vtk_vtu. Thanks, Matt > > Subramanya > > ------------------------------ > Date: Wed, 24 Sep 2014 17:19:51 -0500 > Subject: Re: [petsc-users] Generating xdmf from h5 file. > From: knepley at gmail.com > To: potaman at outlook.com > CC: petsc-users at mcs.anl.gov > > On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva > wrote: > > Hi, > i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. > The h5 file was generated by snes/ex12 which was run as, > > ex12 -dm_view hdf5:my.h5 > > When I do, > petsc_gen_xdmf.py my.h5 > > I get the following error, > > File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", > line 220, in > generateXdmf(sys.argv[1]) > File > "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 208, in generateXdmf > time = np.array(h5['time']).flatten() > File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in > __getitem__ > oid = h5o.open(self.id, self._e(name), lapl=self._lapl) > File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) > KeyError: "unable to open object (Symbol table: Can't open object)" > > I am not sure if the error is on my end. This is on Ubuntu 14.04 with the > serial version of hdf5. I built petsc with --download-hdf5, is it necessary > to use the same version of hdf5 to generate the xdmf file? > > > That code is alpha, and mainly built for me to experiment with an > application here, so it is not user-friendly. In your > HDF5 file, there is no 'time' since you are not running a TS. This access > to h5['time'] should just be protected, and > an empty array should be put in if its not there. > > Matt > > > Thanks > Subramanya > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
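Spelled out as command lines, the two variants Matt distinguishes would look roughly like this (ex12 is the example from the thread, the output names are placeholders, and the viewer string follows the type:filename:format pattern):

  ./ex12 -dm_view ascii:my.vtk:ascii_vtk     (diagnostic ASCII VTK: type ascii, format ascii_vtk)
  ./ex12 -dm_view vtk:my.vtu:vtk_vtu         (production VTU output: type vtk, format vtk_vtu)

The second form produces a .vtu file that ParaView reads directly.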
URL: From salazardetroya at gmail.com Wed Sep 24 17:38:11 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Wed, 24 Sep 2014 17:38:11 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Thanks for your response. I'm attaching a pdf with a description of the model. The description of the PetscSection is necessary for the DMNetwork? It looks like DMNetwork does not use a PetscSection. Miguel On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. wrote: > > >Thanks for your response. My discretization is based on spring elements. > >For the linear one dimensional case in which each spring has a > >coefficient k, their jacobian would be this two by two matrix. > >[ k -k ] > >[ -k k ] > > > >and the internal force > > > >[ k ( Ui - Uj) ] > >[ k ( Uj - Ui) ] > > > >where Ui and Uj are the node displacements (just one displacement per > >node because it's one dimensional) > > > >For the two dimensional case, assuming small deformations, we have a > >four-by-four matrix. Each node has two degrees of freedom. We obtain it > >by performing the outer product of the vector (t , -t) where "t" is the > >vector that connects both nodes in a spring. This is for the case of > >small deformations. I would need to assemble each spring contribution to > >the jacobian and the residual like they were finite elements. The springs > >share nodes, that's how they are connected. This example is just the > >linear case, I will have to implement a nonlinear case in a similar > >fashion. > > > >Seeing the DMNetwork example, I think it's what I need, although I don't > >know much of power electric grids and it's hard for me to understand > >what's going on. Do you have a good reference to be able to follow the > >code? > > > > Please see the attached document which has more description of DMNetwork > and the equations for the power grid example. I don't have anything that > describes how the power grid example is implemented. > > >For example, why are they adding components to the edges? > > > >475: DMNetworkAddComponent > >< > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/D > >MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey[ > >0],&pfdata.branch[i-eStart]);Miguel > > Each edge or node can have several components (limited to 10) attached to > it. The term components, taken from the circuit terminology, refers to the > elements of a network. For example, a component could be a resistor, > inductor, spring, or even edge/vertex weights (for graph problems). For > code implementation, component is a data structure that holds the data > needed for the residual, Jacobian, or any other function evaluation. In > the case of power grid, there are 4 components: branches or transmission > lines connecting nodes, buses or nodes, generators that are incident at a > subset of the nodes, and loads that are also incident at a subset of the > nodes. Each of the these components are defined by their data structures > given in pf.h. > > DMNetwork is a wrapper class of DMPlex specifically for network > applications that can be solely described using nodes, edges, and their > associated components. If you have a PDE, or need FEM, or need other > advanced features then DMPlex would be suitable. Please send us a write-up > of your equations so that we can assist you better. > > Shri > > > > > > > >On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. 
> > wrote: > > > >You may also want to take a look at the DMNetwork framework that can be > >used for general unstructured networks that don't use PDEs. Its > >description is given in the manual and an example is in > >src/snes/examples/tutorials/network/pflow. > > > >Shri > > > >From: Matthew Knepley > >Date: Tue, 23 Sep 2014 22:40:52 -0400 > >To: Miguel Angel Salazar de Troya > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>Hi all > >>I was wondering if it could be possible to build a model similar to the > >>example snes/ex12.c, but with spring elements (for elasticity) instead of > >>simplicial elements. Spring elements in a grid, therefore each element > >>would have two nodes and each node two components. There would be more > >>differences, because instead of calling the functions f0,f1,g0,g1,g2 and > >>g3 to build the residual and the jacobian, I would call a routine that > >>would build the residual vector and the jacobian matrix directly. I would > >>not have shape functions whatsoever. My problem is discrete, I don't have > >>a PDE and my equations are algebraic. What is the best way in petsc to > >>solve this problem? Is there any example that I can follow? Thanks in > >>advance > >> > >> > >> > >> > >>Yes, ex12 is fairly specific to FEM. However, I think the right tools for > >>what you want are > >>DMPlex and PetscSection. Here is how I would proceed: > >> > >> 1) Make a DMPlex that encodes a simple network that you wish to > >>simulate > >> > >> 2) Make a PetscSection that gets the data layout right. Its hard from > >>the above > >> for me to understand where you degrees of freedom actually are. > >>This is usually > >> the hard part. > >> > >> 3) Calculate the residual, so you can check an exact solution. Here you > >>use the > >> PetscSectionGetDof/Offset() for each mesh piece that you are > >>interested in. Again, > >> its hard to be more specific when I do not understand your > >>discretization. > >> > >> Thanks, > >> > >> Matt > >> > >> > >>Miguel > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > > > > > >>(217) 550-2360 > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>What most experimenters take for granted before they begin their > >>experiments is infinitely more interesting than any results to which > >>their experiments lead. > >>-- Norbert Wiener > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: equations.pdf Type: application/pdf Size: 569544 bytes Desc: not available URL: From knepley at gmail.com Wed Sep 24 17:44:01 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 17:44:01 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Wed, Sep 24, 2014 at 5:38 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Thanks for your response. I'm attaching a pdf with a description of the > model. The description of the PetscSection is necessary for the DMNetwork? > It looks like DMNetwork does not use a PetscSection. > It does internally. It takes the information you give it and makes one, much like ex12 takes the FEM information and makes a PetscSection. Thanks, Matt > Miguel > > On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov> wrote: > >> >> >Thanks for your response. My discretization is based on spring elements. >> >For the linear one dimensional case in which each spring has a >> >coefficient k, their jacobian would be this two by two matrix. >> >[ k -k ] >> >[ -k k ] >> > >> >and the internal force >> > >> >[ k ( Ui - Uj) ] >> >[ k ( Uj - Ui) ] >> > >> >where Ui and Uj are the node displacements (just one displacement per >> >node because it's one dimensional) >> > >> >For the two dimensional case, assuming small deformations, we have a >> >four-by-four matrix. Each node has two degrees of freedom. We obtain it >> >by performing the outer product of the vector (t , -t) where "t" is the >> >vector that connects both nodes in a spring. This is for the case of >> >small deformations. I would need to assemble each spring contribution to >> >the jacobian and the residual like they were finite elements. The springs >> >share nodes, that's how they are connected. This example is just the >> >linear case, I will have to implement a nonlinear case in a similar >> >fashion. >> > >> >Seeing the DMNetwork example, I think it's what I need, although I don't >> >know much of power electric grids and it's hard for me to understand >> >what's going on. Do you have a good reference to be able to follow the >> >code? >> >> > >> Please see the attached document which has more description of DMNetwork >> and the equations for the power grid example. I don't have anything that >> describes how the power grid example is implemented. >> >> >For example, why are they adding components to the edges? >> > >> >475: DMNetworkAddComponent >> >< >> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/D >> >> >MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey[ >> >0],&pfdata.branch[i-eStart]);Miguel >> >> Each edge or node can have several components (limited to 10) attached to >> it. The term components, taken from the circuit terminology, refers to the >> elements of a network. For example, a component could be a resistor, >> inductor, spring, or even edge/vertex weights (for graph problems). For >> code implementation, component is a data structure that holds the data >> needed for the residual, Jacobian, or any other function evaluation. In >> the case of power grid, there are 4 components: branches or transmission >> lines connecting nodes, buses or nodes, generators that are incident at a >> subset of the nodes, and loads that are also incident at a subset of the >> nodes. Each of the these components are defined by their data structures >> given in pf.h. 
>> >> DMNetwork is a wrapper class of DMPlex specifically for network >> applications that can be solely described using nodes, edges, and their >> associated components. If you have a PDE, or need FEM, or need other >> advanced features then DMPlex would be suitable. Please send us a write-up >> of your equations so that we can assist you better. >> >> Shri >> >> >> > >> > >> >On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >> > wrote: >> > >> >You may also want to take a look at the DMNetwork framework that can be >> >used for general unstructured networks that don't use PDEs. Its >> >description is given in the manual and an example is in >> >src/snes/examples/tutorials/network/pflow. >> > >> >Shri >> > >> >From: Matthew Knepley >> >Date: Tue, 23 Sep 2014 22:40:52 -0400 >> >To: Miguel Angel Salazar de Troya >> >Cc: "petsc-users at mcs.anl.gov" >> >Subject: Re: [petsc-users] DMPlex with spring elements >> > >> > >> >>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >> >> wrote: >> >> >> >>Hi all >> >>I was wondering if it could be possible to build a model similar to the >> >>example snes/ex12.c, but with spring elements (for elasticity) instead >> of >> >>simplicial elements. Spring elements in a grid, therefore each element >> >>would have two nodes and each node two components. There would be more >> >>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >> >>g3 to build the residual and the jacobian, I would call a routine that >> >>would build the residual vector and the jacobian matrix directly. I >> would >> >>not have shape functions whatsoever. My problem is discrete, I don't >> have >> >>a PDE and my equations are algebraic. What is the best way in petsc to >> >>solve this problem? Is there any example that I can follow? Thanks in >> >>advance >> >> >> >> >> >> >> >> >> >>Yes, ex12 is fairly specific to FEM. However, I think the right tools >> for >> >>what you want are >> >>DMPlex and PetscSection. Here is how I would proceed: >> >> >> >> 1) Make a DMPlex that encodes a simple network that you wish to >> >>simulate >> >> >> >> 2) Make a PetscSection that gets the data layout right. Its hard from >> >>the above >> >> for me to understand where you degrees of freedom actually are. >> >>This is usually >> >> the hard part. >> >> >> >> 3) Calculate the residual, so you can check an exact solution. Here >> you >> >>use the >> >> PetscSectionGetDof/Offset() for each mesh piece that you are >> >>interested in. Again, >> >> its hard to be more specific when I do not understand your >> >>discretization. >> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> >> >>Miguel >> >> >> >> >> >> >> >>-- >> >>Miguel Angel Salazar de Troya >> >>Graduate Research Assistant >> >>Department of Mechanical Science and Engineering >> >>University of Illinois at Urbana-Champaign >> > >> > >> >>(217) 550-2360 >> >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>-- >> >>What most experimenters take for granted before they begin their >> >>experiments is infinitely more interesting than any results to which >> >>their experiments lead. 
>> >>-- Norbert Wiener >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> >> > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From brianyang1106 at gmail.com Wed Sep 24 18:08:58 2014 From: brianyang1106 at gmail.com (Brian Yang) Date: Wed, 24 Sep 2014 18:08:58 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> <871tr0hk0v.fsf@jedbrown.org> Message-ID: For LSQR, http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/impls/lsqr/lsqr.c.html#KSPLSQR I checked the lsqr.c implementation, within routine KSPSolve_LSQR, I saw few places using VecNorm and before KSP_Monitor to output the rnorm, there were many intermediate updates for the iteration. I am just wondering whether I could replace NORM2 to NORM1 and remove related square (or square root) operations to achieve my goal? Thanks. On Wed, Sep 24, 2014 at 3:48 PM, Brian Yang wrote: > Thanks everyone, I will try KSPSetConvergenceTest to test. > > On Wed, Sep 24, 2014 at 2:54 PM, Jed Brown wrote: > >> Matthew Knepley writes: >> >> > On Wed, Sep 24, 2014 at 2:31 PM, Barry Smith >> wrote: >> > >> >> >> >> Brian, >> >> >> >> Your convergence test routine could call KSPBuildResidual() and >> then >> >> compute the 1-norm of the residual and make any decision it likes >> based on >> >> that norm. See KSPSetConvergenceTest() >> > >> > >> > Note that what Barry is saying is that the convergence theory for CG >> > guarantees monotonicity in the energy (A) norm, but says nothing >> > about L1 so you might get crap. >> >> Moreover, if you have an overdetermined linear system, but want to >> minimize the 1-norm of the residual (instead of the 2-norm that least >> squares minimizes), you are actually asking to solve a "linear program" >> and need to use an LP solver (a very different algorithm). >> >> If you merely watch the 1-norm of the residual, it might increase as the >> least squares solver converges. >> > > > > -- > Brian Yang > U of Houston > > > > -- Brian Yang U of Houston -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 24 18:13:04 2014 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Sep 2014 18:13:04 -0500 Subject: [petsc-users] How to set L1 norm as the converge test? In-Reply-To: References: <66EA868A-6648-4989-830D-2306467E0EFA@mcs.anl.gov> <81FC0C31-3961-46F6-9F53-7C216A03118B@mcs.anl.gov> <871tr0hk0v.fsf@jedbrown.org> Message-ID: On Wed, Sep 24, 2014 at 6:08 PM, Brian Yang wrote: > For LSQR, > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/impls/lsqr/lsqr.c.html#KSPLSQR > > I checked the lsqr.c implementation, within routine KSPSolve_LSQR, I saw > few places using VecNorm and before KSP_Monitor to output the rnorm, there > were many intermediate updates for the iteration. 
> > I am just wondering whether I could replace NORM2 to NORM1 and remove > related square (or square root) operations to achieve my goal? > I think the depends on your goal. If it is the print the L1 norm out, then the monitor or convergence test is enough. If it is to have a mathematically consistent way of using the L1 norm, then no you will have to use a different solver. Thanks, Matt > Thanks. > > On Wed, Sep 24, 2014 at 3:48 PM, Brian Yang > wrote: > >> Thanks everyone, I will try KSPSetConvergenceTest to test. >> >> On Wed, Sep 24, 2014 at 2:54 PM, Jed Brown wrote: >> >>> Matthew Knepley writes: >>> >>> > On Wed, Sep 24, 2014 at 2:31 PM, Barry Smith >>> wrote: >>> > >>> >> >>> >> Brian, >>> >> >>> >> Your convergence test routine could call KSPBuildResidual() and >>> then >>> >> compute the 1-norm of the residual and make any decision it likes >>> based on >>> >> that norm. See KSPSetConvergenceTest() >>> > >>> > >>> > Note that what Barry is saying is that the convergence theory for CG >>> > guarantees monotonicity in the energy (A) norm, but says nothing >>> > about L1 so you might get crap. >>> >>> Moreover, if you have an overdetermined linear system, but want to >>> minimize the 1-norm of the residual (instead of the 2-norm that least >>> squares minimizes), you are actually asking to solve a "linear program" >>> and need to use an LP solver (a very different algorithm). >>> >>> If you merely watch the 1-norm of the residual, it might increase as the >>> least squares solver converges. >>> >> >> >> >> -- >> Brian Yang >> U of Houston >> >> >> >> > > > -- > Brian Yang > U of Houston > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Wed Sep 24 21:52:07 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Thu, 25 Sep 2014 02:52:07 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: If you have equations only at the nodes, with a part of it contributed by the edges (springs), then you can use DMNetwork. If you are planning to have equations for the beads in the future, or other higher layers, then DMPlex has better functionality to manage that. Shri From: Miguel Angel Salazar de Troya > Date: Wed, 24 Sep 2014 17:38:11 -0500 To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements Thanks for your response. I'm attaching a pdf with a description of the model. The description of the PetscSection is necessary for the DMNetwork? It looks like DMNetwork does not use a PetscSection. Miguel On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. > wrote: >Thanks for your response. My discretization is based on spring elements. >For the linear one dimensional case in which each spring has a >coefficient k, their jacobian would be this two by two matrix. >[ k -k ] >[ -k k ] > >and the internal force > >[ k ( Ui - Uj) ] >[ k ( Uj - Ui) ] > >where Ui and Uj are the node displacements (just one displacement per >node because it's one dimensional) > >For the two dimensional case, assuming small deformations, we have a >four-by-four matrix. Each node has two degrees of freedom. We obtain it >by performing the outer product of the vector (t , -t) where "t" is the >vector that connects both nodes in a spring. This is for the case of >small deformations. 
I would need to assemble each spring contribution to >the jacobian and the residual like they were finite elements. The springs >share nodes, that's how they are connected. This example is just the >linear case, I will have to implement a nonlinear case in a similar >fashion. > >Seeing the DMNetwork example, I think it's what I need, although I don't >know much of power electric grids and it's hard for me to understand >what's going on. Do you have a good reference to be able to follow the >code? > Please see the attached document which has more description of DMNetwork and the equations for the power grid example. I don't have anything that describes how the power grid example is implemented. >For example, why are they adding components to the edges? > >475: DMNetworkAddComponent >MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey[ >0],&pfdata.branch[i-eStart]);Miguel Each edge or node can have several components (limited to 10) attached to it. The term components, taken from the circuit terminology, refers to the elements of a network. For example, a component could be a resistor, inductor, spring, or even edge/vertex weights (for graph problems). For code implementation, component is a data structure that holds the data needed for the residual, Jacobian, or any other function evaluation. In the case of power grid, there are 4 components: branches or transmission lines connecting nodes, buses or nodes, generators that are incident at a subset of the nodes, and loads that are also incident at a subset of the nodes. Each of the these components are defined by their data structures given in pf.h. DMNetwork is a wrapper class of DMPlex specifically for network applications that can be solely described using nodes, edges, and their associated components. If you have a PDE, or need FEM, or need other advanced features then DMPlex would be suitable. Please send us a write-up of your equations so that we can assist you better. Shri > > >On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >> wrote: > >You may also want to take a look at the DMNetwork framework that can be >used for general unstructured networks that don't use PDEs. Its >description is given in the manual and an example is in >src/snes/examples/tutorials/network/pflow. > >Shri > >From: Matthew Knepley > >Date: Tue, 23 Sep 2014 22:40:52 -0400 >To: Miguel Angel Salazar de Troya > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > >>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>> wrote: >> >>Hi all >>I was wondering if it could be possible to build a model similar to the >>example snes/ex12.c, but with spring elements (for elasticity) instead of >>simplicial elements. Spring elements in a grid, therefore each element >>would have two nodes and each node two components. There would be more >>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >>g3 to build the residual and the jacobian, I would call a routine that >>would build the residual vector and the jacobian matrix directly. I would >>not have shape functions whatsoever. My problem is discrete, I don't have >>a PDE and my equations are algebraic. What is the best way in petsc to >>solve this problem? Is there any example that I can follow? Thanks in >>advance >> >> >> >> >>Yes, ex12 is fairly specific to FEM. However, I think the right tools for >>what you want are >>DMPlex and PetscSection. 
Here is how I would proceed: >> >> 1) Make a DMPlex that encodes a simple network that you wish to >>simulate >> >> 2) Make a PetscSection that gets the data layout right. Its hard from >>the above >> for me to understand where you degrees of freedom actually are. >>This is usually >> the hard part. >> >> 3) Calculate the residual, so you can check an exact solution. Here you >>use the >> PetscSectionGetDof/Offset() for each mesh piece that you are >>interested in. Again, >> its hard to be more specific when I do not understand your >>discretization. >> >> Thanks, >> >> Matt >> >> >>Miguel >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign > > >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu -- Miguel Angel Salazar de Troya Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From filippo.leonardi at sam.math.ethz.ch Thu Sep 25 02:43:20 2014 From: filippo.leonardi at sam.math.ethz.ch (Filippo Leonardi) Date: Thu, 25 Sep 2014 09:43:20 +0200 Subject: [petsc-users] Reduce DMDA's Message-ID: <5495841.MtzeHitpxM@besikovitch-ii> Hi, Let's say I have some independent DMDAs, created as follows: MPI_Comm_split(MPI_COMM_WORLD, comm_rank % 2, 0, &newcomm); DM da; DMDACreate2d(newcomm, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_NONE, DMDA_STENCIL_BOX ,50, 50, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL,&da); For instance for 4 processors I get 2 DMDA's. Now, I want to reduce (in the sense of MPI) the global/local DMDA vectors to only one of the MPI groups (say group 0). Is there an elegant way (e.g. with Scatters) to do that? My current implementation would be: get the local array on each process and reduce (with MPI_Reduce) to the root of each partition. DMDA for Group 1: +------+------+ | 0 | 1 | | | | +------+------+ | 2 | 3 | | | | +------+------+ DMDA for Group 2: +------+------+ | 4 | 5 | | | | +------+------+ | 6 | 7 | | | | +------+------+ Reduce rank 0 and 4 to rank 0. Reduce rank 1 and 5 to rank 1. Reduce rank 2 and 6 to rank 2. Reduce rank 3 and 7 to rank 3. Clearly this implementation is cumbersome. Any idea? Best, Filippo -------------- next part -------------- A non-text attachment was scrubbed... Name: ETHZ.vcf Type: text/vcard Size: 593 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ETHZ.vcf Type: text/vcard Size: 594 bytes Desc: not available URL: From salazardetroya at gmail.com Thu Sep 25 08:55:25 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 25 Sep 2014 08:55:25 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Thanks. I think the term "Component" was confusing me, I thought it was related to the components of a field. I think this would be useful to me if I wanted to assign coordinates to the vertices, wouldn't it? 
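For step 2) in the list quoted above, a rough sketch of a PetscSection that puts two displacement unknowns on every vertex of a DMPlex; the DM is called dm here, and DMSetDefaultSection() is the 3.5-era call for attaching the layout:

  PetscSection s;
  PetscInt     pStart, pEnd, vStart, vEnd, v;

  ierr = DMPlexGetChart(dm, &pStart, &pEnd);CHKERRQ(ierr);
  ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr);   /* depth 0 = vertices */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(s, pStart, pEnd);CHKERRQ(ierr);
  for (v = vStart; v < vEnd; v++) {
    ierr = PetscSectionSetDof(s, v, 2);CHKERRQ(ierr);   /* two displacement components per node */
  }
  ierr = PetscSectionSetUp(s);CHKERRQ(ierr);
  ierr = DMSetDefaultSection(dm, s);CHKERRQ(ierr);
  ierr = PetscSectionDestroy(&s);CHKERRQ(ierr);

With the section attached, DMCreateGlobalVector() and DMCreateMatrix() produce a residual vector and Jacobian with this layout, and PetscSectionGetOffset() locates each vertex's pair of dofs when the spring contributions are assembled.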
Also, I was wondering how to set up dirichlet boundary conditions, basically fixing certain nodes position. Could I do it as the function *SetInitialValues *does it in the pflow example? These values are used to eliminate the zeroth-order energy modes of the stiffness matrix? Last question, in my case I have two degrees of freedom per node, when I grab the offset with DMNetworkVariableOffset, that's for the first degree of freedom in that node and the second degree of freedom would just be offset+1? Miguel On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. wrote: > If you have equations only at the nodes, with a part of it contributed > by the edges (springs), then you can use DMNetwork. If you are planning to > have equations for the beads in the future, or other higher layers, then > DMPlex has better functionality to manage that. > > Shri > > From: Miguel Angel Salazar de Troya > Date: Wed, 24 Sep 2014 17:38:11 -0500 > To: Shri > > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > Thanks for your response. I'm attaching a pdf with a description of the > model. The description of the PetscSection is necessary for the DMNetwork? > It looks like DMNetwork does not use a PetscSection. > > > Miguel > > On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov> wrote: > >> >> >Thanks for your response. My discretization is based on spring elements. >> >For the linear one dimensional case in which each spring has a >> >coefficient k, their jacobian would be this two by two matrix. >> >[ k -k ] >> >[ -k k ] >> > >> >and the internal force >> > >> >[ k ( Ui - Uj) ] >> >[ k ( Uj - Ui) ] >> > >> >where Ui and Uj are the node displacements (just one displacement per >> >node because it's one dimensional) >> > >> >For the two dimensional case, assuming small deformations, we have a >> >four-by-four matrix. Each node has two degrees of freedom. We obtain it >> >by performing the outer product of the vector (t , -t) where "t" is the >> >vector that connects both nodes in a spring. This is for the case of >> >small deformations. I would need to assemble each spring contribution to >> >the jacobian and the residual like they were finite elements. The springs >> >share nodes, that's how they are connected. This example is just the >> >linear case, I will have to implement a nonlinear case in a similar >> >fashion. >> > >> >Seeing the DMNetwork example, I think it's what I need, although I don't >> >know much of power electric grids and it's hard for me to understand >> >what's going on. Do you have a good reference to be able to follow the >> >code? >> >> > >> Please see the attached document which has more description of DMNetwork >> and the equations for the power grid example. I don't have anything that >> describes how the power grid example is implemented. >> >> >For example, why are they adding components to the edges? >> > >> >475: DMNetworkAddComponent >> >< >> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/D >> >> >MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey[ >> >0],&pfdata.branch[i-eStart]);Miguel >> >> Each edge or node can have several components (limited to 10) attached to >> it. The term components, taken from the circuit terminology, refers to the >> elements of a network. For example, a component could be a resistor, >> inductor, spring, or even edge/vertex weights (for graph problems). 
For >> code implementation, component is a data structure that holds the data >> needed for the residual, Jacobian, or any other function evaluation. In >> the case of power grid, there are 4 components: branches or transmission >> lines connecting nodes, buses or nodes, generators that are incident at a >> subset of the nodes, and loads that are also incident at a subset of the >> nodes. Each of the these components are defined by their data structures >> given in pf.h. >> >> DMNetwork is a wrapper class of DMPlex specifically for network >> applications that can be solely described using nodes, edges, and their >> associated components. If you have a PDE, or need FEM, or need other >> advanced features then DMPlex would be suitable. Please send us a write-up >> of your equations so that we can assist you better. >> >> Shri >> >> >> > >> > >> >On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >> > wrote: >> > >> >You may also want to take a look at the DMNetwork framework that can be >> >used for general unstructured networks that don't use PDEs. Its >> >description is given in the manual and an example is in >> >src/snes/examples/tutorials/network/pflow. >> > >> >Shri >> > >> >From: Matthew Knepley >> >Date: Tue, 23 Sep 2014 22:40:52 -0400 >> >To: Miguel Angel Salazar de Troya >> >Cc: "petsc-users at mcs.anl.gov" >> >Subject: Re: [petsc-users] DMPlex with spring elements >> > >> > >> >>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >> >> wrote: >> >> >> >>Hi all >> >>I was wondering if it could be possible to build a model similar to the >> >>example snes/ex12.c, but with spring elements (for elasticity) instead >> of >> >>simplicial elements. Spring elements in a grid, therefore each element >> >>would have two nodes and each node two components. There would be more >> >>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >> >>g3 to build the residual and the jacobian, I would call a routine that >> >>would build the residual vector and the jacobian matrix directly. I >> would >> >>not have shape functions whatsoever. My problem is discrete, I don't >> have >> >>a PDE and my equations are algebraic. What is the best way in petsc to >> >>solve this problem? Is there any example that I can follow? Thanks in >> >>advance >> >> >> >> >> >> >> >> >> >>Yes, ex12 is fairly specific to FEM. However, I think the right tools >> for >> >>what you want are >> >>DMPlex and PetscSection. Here is how I would proceed: >> >> >> >> 1) Make a DMPlex that encodes a simple network that you wish to >> >>simulate >> >> >> >> 2) Make a PetscSection that gets the data layout right. Its hard from >> >>the above >> >> for me to understand where you degrees of freedom actually are. >> >>This is usually >> >> the hard part. >> >> >> >> 3) Calculate the residual, so you can check an exact solution. Here >> you >> >>use the >> >> PetscSectionGetDof/Offset() for each mesh piece that you are >> >>interested in. Again, >> >> its hard to be more specific when I do not understand your >> >>discretization. 
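To make the component description above concrete, a rough sketch of how spring and node components might be registered and attached for this problem, loosely following the pflow example quoted in the thread. The struct names and their fields (a stiffness k, an is_boundary marker for fixed nodes) are made up for illustration, springs[] and nodes[] are user arrays of those structs, and the calls are the PETSc 3.5-era DMNetwork API:

  typedef struct { PetscScalar k; }         SPRINGDATA;   /* attached to each edge   */
  typedef struct { PetscBool is_boundary; } NODEDATA;     /* attached to each vertex */

  PetscInt keySpring, keyNode, e, eStart, eEnd, v, vStart, vEnd;

  ierr = DMNetworkRegisterComponent(networkdm, "spring", sizeof(SPRINGDATA), &keySpring);CHKERRQ(ierr);
  ierr = DMNetworkRegisterComponent(networkdm, "node",   sizeof(NODEDATA),   &keyNode);CHKERRQ(ierr);
  /* ... DMNetworkSetSizes(), DMNetworkSetEdgeList(), DMNetworkLayoutSetUp() ... */

  ierr = DMNetworkGetEdgeRange(networkdm, &eStart, &eEnd);CHKERRQ(ierr);
  for (e = eStart; e < eEnd; e++) {
    ierr = DMNetworkAddComponent(networkdm, e, keySpring, &springs[e-eStart]);CHKERRQ(ierr);
  }
  ierr = DMNetworkGetVertexRange(networkdm, &vStart, &vEnd);CHKERRQ(ierr);
  for (v = vStart; v < vEnd; v++) {
    ierr = DMNetworkAddComponent(networkdm, v, keyNode, &nodes[v-vStart]);CHKERRQ(ierr);
    ierr = DMNetworkAddNumVariables(networkdm, v, 2);CHKERRQ(ierr);   /* two displacement dofs per node */
  }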
>> >> >> >> Thanks, >> >> >> >> Matt >> >> >> >> >> >>Miguel >> >> >> >> >> >> >> >>-- >> >>Miguel Angel Salazar de Troya >> >>Graduate Research Assistant >> >>Department of Mechanical Science and Engineering >> >>University of Illinois at Urbana-Champaign >> > >> > >> >>(217) 550-2360 >> >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>-- >> >>What most experimenters take for granted before they begin their >> >>experiments is infinitely more interesting than any results to which >> >>their experiments lead. >> >>-- Norbert Wiener >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> >> > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 25 09:18:24 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 09:18:24 -0500 Subject: [petsc-users] Reduce DMDA's In-Reply-To: <5495841.MtzeHitpxM@besikovitch-ii> References: <5495841.MtzeHitpxM@besikovitch-ii> Message-ID: On Thu, Sep 25, 2014 at 2:43 AM, Filippo Leonardi < filippo.leonardi at sam.math.ethz.ch> wrote: > Hi, > > Let's say I have some independent DMDAs, created as follows: > > MPI_Comm_split(MPI_COMM_WORLD, comm_rank % 2, 0, &newcomm); > > DM da; > DMDACreate2d(newcomm, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_NONE, > DMDA_STENCIL_BOX ,50, 50, > PETSC_DECIDE, PETSC_DECIDE, > 1, 1, NULL, NULL,&da); > > For instance for 4 processors I get 2 DMDA's. Now, I want to reduce (in the > sense of MPI) the global/local DMDA vectors to only one of the MPI groups > (say > group 0). Is there an elegant way (e.g. with Scatters) to do that? > > My current implementation would be: get the local array on each process and > reduce (with MPI_Reduce) to the root of each partition. > > DMDA for Group 1: > +------+------+ > | 0 | 1 | > | | | > +------+------+ > | 2 | 3 | > | | | > +------+------+ > DMDA for Group 2: > +------+------+ > | 4 | 5 | > | | | > +------+------+ > | 6 | 7 | > | | | > +------+------+ > > Reduce rank 0 and 4 to rank 0. > Reduce rank 1 and 5 to rank 1. > Reduce rank 2 and 6 to rank 2. > Reduce rank 3 and 7 to rank 3. > > Clearly this implementation is cumbersome. Any idea? > I think that is the simplest way to do it, and its is 3 calls VecGet/RestoreArray() and MPI_Reduce(). Matt > Best, > Filippo -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Thu Sep 25 10:27:57 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Thu, 25 Sep 2014 15:27:57 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: >Thanks. I think the term "Component" was confusing me, I thought it was >related to the components of a field. 
I think this would be useful to me >if I wanted to assign coordinates to the vertices, wouldn't it? Yes. You can put whatever data you want in the component data structure. > >Also, I was wondering how to set up dirichlet boundary conditions, >basically fixing certain nodes position. > > > You can add a component at each node with a field marking whether the node is a boundary node. >Could I do it as the function SetInitialValues does it in the pflow >example? > No. You need to put in the component data structure before calling DMNetworkAddComponent() >These values are used to eliminate the zeroth-order energy modes of the >stiffness matrix? > > >Last question, in my case I have two degrees of freedom per node, when I >grab the offset with DMNetworkVariableOffset, that's for the first degree >of freedom in that node and the second degree of freedom would just be >offset+1? > Yes. Shri > >Miguel > > >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. > wrote: > >If you have equations only at the nodes, with a part of it contributed by >the edges (springs), then you can use DMNetwork. If you are planning to >have equations for the beads in the future, or other higher layers, then >DMPlex has better functionality > to manage that. > >Shri > > >From: Miguel Angel Salazar de Troya >Date: Wed, 24 Sep 2014 17:38:11 -0500 >To: Shri >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >Thanks for your response. I'm attaching a pdf with a description of the >model. The description of the PetscSection is necessary for the >DMNetwork? It looks like DMNetwork does not use a PetscSection. > > > > > > >Miguel > > >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. > wrote: > > >>Thanks for your response. My discretization is based on spring elements. >>For the linear one dimensional case in which each spring has a >>coefficient k, their jacobian would be this two by two matrix. >>[ k -k ] >>[ -k k ] >> >>and the internal force >> >>[ k ( Ui - Uj) ] >>[ k ( Uj - Ui) ] >> >>where Ui and Uj are the node displacements (just one displacement per >>node because it's one dimensional) >> >>For the two dimensional case, assuming small deformations, we have a >>four-by-four matrix. Each node has two degrees of freedom. We obtain it >>by performing the outer product of the vector (t , -t) where "t" is the >>vector that connects both nodes in a spring. This is for the case of >>small deformations. I would need to assemble each spring contribution to >>the jacobian and the residual like they were finite elements. The springs >>share nodes, that's how they are connected. This example is just the >>linear case, I will have to implement a nonlinear case in a similar >>fashion. >> >>Seeing the DMNetwork example, I think it's what I need, although I don't >>know much of power electric grids and it's hard for me to understand >>what's going on. Do you have a good reference to be able to follow the >>code? > >> >Please see the attached document which has more description of DMNetwork >and the equations for the power grid example. I don't have anything that >describes how the power grid example is implemented. > >>For example, why are they adding components to the edges? >> >>475: DMNetworkAddComponent >>>D >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey >>[ >>0],&pfdata.branch[i-eStart]);Miguel > >Each edge or node can have several components (limited to 10) attached to >it. 
The term components, taken from the circuit terminology, refers to the >elements of a network. For example, a component could be a resistor, >inductor, spring, or even edge/vertex weights (for graph problems). For >code implementation, component is a data structure that holds the data >needed for the residual, Jacobian, or any other function evaluation. In >the case of power grid, there are 4 components: branches or transmission >lines connecting nodes, buses or nodes, generators that are incident at a >subset of the nodes, and loads that are also incident at a subset of the >nodes. Each of the these components are defined by their data structures >given in pf.h. > >DMNetwork is a wrapper class of DMPlex specifically for network >applications that can be solely described using nodes, edges, and their >associated components. If you have a PDE, or need FEM, or need other >advanced features then DMPlex would be suitable. Please send us a write-up >of your equations so that we can assist you better. > >Shri > > >> >> >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >> wrote: >> >>You may also want to take a look at the DMNetwork framework that can be >>used for general unstructured networks that don't use PDEs. Its >>description is given in the manual and an example is in >>src/snes/examples/tutorials/network/pflow. >> >>Shri >> >>From: Matthew Knepley >>Date: Tue, 23 Sep 2014 22:40:52 -0400 >>To: Miguel Angel Salazar de Troya >>Cc: "petsc-users at mcs.anl.gov" >>Subject: Re: [petsc-users] DMPlex with spring elements >> >> >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>> wrote: >>> >>>Hi all >>>I was wondering if it could be possible to build a model similar to the >>>example snes/ex12.c, but with spring elements (for elasticity) instead >>>of >>>simplicial elements. Spring elements in a grid, therefore each element >>>would have two nodes and each node two components. There would be more >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >>>g3 to build the residual and the jacobian, I would call a routine that >>>would build the residual vector and the jacobian matrix directly. I >>>would >>>not have shape functions whatsoever. My problem is discrete, I don't >>>have >>>a PDE and my equations are algebraic. What is the best way in petsc to >>>solve this problem? Is there any example that I can follow? Thanks in >>>advance >>> >>> >>> >>> >>>Yes, ex12 is fairly specific to FEM. However, I think the right tools >>>for >>>what you want are >>>DMPlex and PetscSection. Here is how I would proceed: >>> >>> 1) Make a DMPlex that encodes a simple network that you wish to >>>simulate >>> >>> 2) Make a PetscSection that gets the data layout right. Its hard from >>>the above >>> for me to understand where you degrees of freedom actually are. >>>This is usually >>> the hard part. >>> >>> 3) Calculate the residual, so you can check an exact solution. Here >>>you >>>use the >>> PetscSectionGetDof/Offset() for each mesh piece that you are >>>interested in. Again, >>> its hard to be more specific when I do not understand your >>>discretization. 
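For the DMNetwork route discussed above, a rough sketch (not taken from the pflow example) of a per-spring residual loop with two variables per vertex. Routine names follow the DMNetwork API of this PETSc generation (DMNetworkGetConnectedNodes(), DMNetworkGetVariableOffset()) and were renamed in later releases; the per-component spring law and the stiffness k are placeholders rather than the outer-product form described earlier in the thread:

#include <petscdmnetwork.h>

static PetscErrorCode SpringResidual(DM networkdm, Vec localU, Vec localF)
{
  PetscInt           e, eStart, eEnd, offi, offj, c;
  const PetscInt    *cone;
  const PetscScalar *u;
  PetscScalar       *f;
  const PetscScalar  k = 1.0;                  /* placeholder stiffness */
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(localU, &u);CHKERRQ(ierr);
  ierr = VecGetArray(localF, &f);CHKERRQ(ierr);
  ierr = DMNetworkGetEdgeRange(networkdm, &eStart, &eEnd);CHKERRQ(ierr);
  for (e = eStart; e < eEnd; ++e) {
    ierr = DMNetworkGetConnectedNodes(networkdm, e, &cone);CHKERRQ(ierr);   /* the two end vertices */
    ierr = DMNetworkGetVariableOffset(networkdm, cone[0], &offi);CHKERRQ(ierr);
    ierr = DMNetworkGetVariableOffset(networkdm, cone[1], &offj);CHKERRQ(ierr);
    for (c = 0; c < 2; ++c) {                  /* first dof at offset, second at offset+1 */
      f[offi+c] += k*(u[offi+c] - u[offj+c]);  /* the [k -k; -k k] block, applied per component */
      f[offj+c] += k*(u[offj+c] - u[offi+c]);
    }
  }
  ierr = VecRestoreArray(localF, &f);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(localU, &u);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Fixed (Dirichlet) vertices would then be handled the way the REF_BUS lines later in this thread do it: overwrite f[offset] and f[offset+1] for those vertices after this loop.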
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>Miguel >>> >>> >>> >>>-- >>>Miguel Angel Salazar de Troya >>>Graduate Research Assistant >>>Department of Mechanical Science and Engineering >>>University of Illinois at Urbana-Champaign >> >> >>>(217) 550-2360 >>>salaza11 at illinois.edu >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>-- >>>What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu From Mathieu.Morlighem at uci.edu Thu Sep 25 10:46:42 2014 From: Mathieu.Morlighem at uci.edu (Mathieu MORLIGHEM) Date: Thu, 25 Sep 2014 08:46:42 -0700 Subject: [petsc-users] Keep symbolic factorization Message-ID: <0BF27D1E-99F0-4940-B57D-809658CF73A5@uci.edu> Hi, I am using an augmented Lagrangian approach that requires to solve multiple times a system with the same Stiffness matrix but different right hand sides: K U1 = F1 K U2 = F2 ? K UN = FN I am using MUMPS right now, and I would like to do the analysis and factorization only once (to compute U1), and then only call the back substitution in order to speed up the convergence. Is there any option in PETSc to do that? If yes, would it work with other direct solvers (cholmod, Pardiso)? Thanks a lot for your help -- Mathieu Morlighem Assistant Professor Department of Earth System Science University of California, Irvine 3218 Croul Hall, Irvine, CA 92697-3100 Mathieu.Morlighem at uci.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 25 10:51:24 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 10:51:24 -0500 Subject: [petsc-users] Keep symbolic factorization In-Reply-To: <0BF27D1E-99F0-4940-B57D-809658CF73A5@uci.edu> References: <0BF27D1E-99F0-4940-B57D-809658CF73A5@uci.edu> Message-ID: On Thu, Sep 25, 2014 at 10:46 AM, Mathieu MORLIGHEM < Mathieu.Morlighem at uci.edu> wrote: > Hi, > > I am using an augmented Lagrangian approach that requires to solve > multiple times a system with the same Stiffness matrix but different right > hand sides: > > K U1 = F1 > K U2 = F2 > ? > K UN = FN > > I am using MUMPS right now, and I would like to do the analysis and > factorization only once (to compute U1), and then only call the back > substitution in order to speed up the convergence. > Is there any option in PETSc to do that? If yes, would it work with other > direct solvers (cholmod, Pardiso)? > In general, this works automatically, and will work with all factorizations. If there is a problem with one, let us know. 
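A sketch of the reuse pattern described above, with the petsc-3.5 calling sequence; K, F[i], U[i] are placeholders. The symbolic and numeric factorization is built during the first KSPSolve() and only the forward/back substitutions are repeated afterwards, e.g. when run with -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps (the option was renamed -pc_factor_mat_solver_type in later releases):

#include <petscksp.h>

static PetscErrorCode SolveManyRHS(Mat K, PetscInt N, Vec F[], Vec U[])
{
  KSP            ksp;
  PetscInt       i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PetscObjectComm((PetscObject)K), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, K, K);CHKERRQ(ierr);   /* set once; K never changes */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  for (i = 0; i < N; ++i) {
    /* factorization happens on the first solve only; later solves reuse it */
    ierr = KSPSolve(ksp, F[i], U[i]);CHKERRQ(ierr);
  }
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}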
Thanks, Matt > Thanks a lot for your help > -- > Mathieu Morlighem > > Assistant Professor > Department of Earth System Science > University of California, Irvine > 3218 Croul Hall, Irvine, CA 92697-3100 > Mathieu.Morlighem at uci.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Thu Sep 25 10:52:16 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 25 Sep 2014 10:52:16 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Thanks. Once I have marked the nodes that are fixed nodes using the component data structure, how can I process it later? I mean, at what point does the solver know that those degrees of freedom are actually fixed and how I can tell it that they are fixed? Miguel On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. < abhyshr at mcs.anl.gov> wrote: > > > >Thanks. I think the term "Component" was confusing me, I thought it was > >related to the components of a field. I think this would be useful to me > >if I wanted to assign coordinates to the vertices, wouldn't it? > > Yes. You can put whatever data you want in the component data structure. > > > > >Also, I was wondering how to set up dirichlet boundary conditions, > >basically fixing certain nodes position. > > > > > > > > You can add a component at each node with a field marking whether the node > is a boundary node. > > >Could I do it as the function SetInitialValues does it in the pflow > >example? > > > > No. You need to put in the component data structure before calling > DMNetworkAddComponent() > > > >These values are used to eliminate the zeroth-order energy modes of the > >stiffness matrix? > > > > > > > >Last question, in my case I have two degrees of freedom per node, when I > >grab the offset with DMNetworkVariableOffset, that's for the first degree > >of freedom in that node and the second degree of freedom would just be > >offset+1? > > > > Yes. > > Shri > > > > >Miguel > > > > > >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. > > wrote: > > > >If you have equations only at the nodes, with a part of it contributed by > >the edges (springs), then you can use DMNetwork. If you are planning to > >have equations for the beads in the future, or other higher layers, then > >DMPlex has better functionality > > to manage that. > > > >Shri > > > > > >From: Miguel Angel Salazar de Troya > >Date: Wed, 24 Sep 2014 17:38:11 -0500 > >To: Shri > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > > > > > > > >Thanks for your response. I'm attaching a pdf with a description of the > >model. The description of the PetscSection is necessary for the > >DMNetwork? It looks like DMNetwork does not use a PetscSection. > > > > > > > > > > > > > >Miguel > > > > > >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. > > wrote: > > > > > >>Thanks for your response. My discretization is based on spring elements. > >>For the linear one dimensional case in which each spring has a > >>coefficient k, their jacobian would be this two by two matrix. 
> >>[ k -k ] > >>[ -k k ] > >> > >>and the internal force > >> > >>[ k ( Ui - Uj) ] > >>[ k ( Uj - Ui) ] > >> > >>where Ui and Uj are the node displacements (just one displacement per > >>node because it's one dimensional) > >> > >>For the two dimensional case, assuming small deformations, we have a > >>four-by-four matrix. Each node has two degrees of freedom. We obtain it > >>by performing the outer product of the vector (t , -t) where "t" is the > >>vector that connects both nodes in a spring. This is for the case of > >>small deformations. I would need to assemble each spring contribution to > >>the jacobian and the residual like they were finite elements. The springs > >>share nodes, that's how they are connected. This example is just the > >>linear case, I will have to implement a nonlinear case in a similar > >>fashion. > >> > >>Seeing the DMNetwork example, I think it's what I need, although I don't > >>know much of power electric grids and it's hard for me to understand > >>what's going on. Do you have a good reference to be able to follow the > >>code? > > > >> > >Please see the attached document which has more description of DMNetwork > >and the equations for the power grid example. I don't have anything that > >describes how the power grid example is implemented. > > > >>For example, why are they adding components to the edges? > >> > >>475: DMNetworkAddComponent > >>< > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/ > >>D > >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey > >>[ > >>0],&pfdata.branch[i-eStart]);Miguel > > > >Each edge or node can have several components (limited to 10) attached to > >it. The term components, taken from the circuit terminology, refers to the > >elements of a network. For example, a component could be a resistor, > >inductor, spring, or even edge/vertex weights (for graph problems). For > >code implementation, component is a data structure that holds the data > >needed for the residual, Jacobian, or any other function evaluation. In > >the case of power grid, there are 4 components: branches or transmission > >lines connecting nodes, buses or nodes, generators that are incident at a > >subset of the nodes, and loads that are also incident at a subset of the > >nodes. Each of the these components are defined by their data structures > >given in pf.h. > > > >DMNetwork is a wrapper class of DMPlex specifically for network > >applications that can be solely described using nodes, edges, and their > >associated components. If you have a PDE, or need FEM, or need other > >advanced features then DMPlex would be suitable. Please send us a write-up > >of your equations so that we can assist you better. > > > >Shri > > > > > >> > >> > >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. > >> wrote: > >> > >>You may also want to take a look at the DMNetwork framework that can be > >>used for general unstructured networks that don't use PDEs. Its > >>description is given in the manual and an example is in > >>src/snes/examples/tutorials/network/pflow. 
> >> > >>Shri > >> > >>From: Matthew Knepley > >>Date: Tue, 23 Sep 2014 22:40:52 -0400 > >>To: Miguel Angel Salazar de Troya > >>Cc: "petsc-users at mcs.anl.gov" > >>Subject: Re: [petsc-users] DMPlex with spring elements > >> > >> > >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya > >>> wrote: > >>> > >>>Hi all > >>>I was wondering if it could be possible to build a model similar to the > >>>example snes/ex12.c, but with spring elements (for elasticity) instead > >>>of > >>>simplicial elements. Spring elements in a grid, therefore each element > >>>would have two nodes and each node two components. There would be more > >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 and > >>>g3 to build the residual and the jacobian, I would call a routine that > >>>would build the residual vector and the jacobian matrix directly. I > >>>would > >>>not have shape functions whatsoever. My problem is discrete, I don't > >>>have > >>>a PDE and my equations are algebraic. What is the best way in petsc to > >>>solve this problem? Is there any example that I can follow? Thanks in > >>>advance > >>> > >>> > >>> > >>> > >>>Yes, ex12 is fairly specific to FEM. However, I think the right tools > >>>for > >>>what you want are > >>>DMPlex and PetscSection. Here is how I would proceed: > >>> > >>> 1) Make a DMPlex that encodes a simple network that you wish to > >>>simulate > >>> > >>> 2) Make a PetscSection that gets the data layout right. Its hard from > >>>the above > >>> for me to understand where you degrees of freedom actually are. > >>>This is usually > >>> the hard part. > >>> > >>> 3) Calculate the residual, so you can check an exact solution. Here > >>>you > >>>use the > >>> PetscSectionGetDof/Offset() for each mesh piece that you are > >>>interested in. Again, > >>> its hard to be more specific when I do not understand your > >>>discretization. > >>> > >>> Thanks, > >>> > >>> Matt > >>> > >>> > >>>Miguel > >>> > >>> > >>> > >>>-- > >>>Miguel Angel Salazar de Troya > >>>Graduate Research Assistant > >>>Department of Mechanical Science and Engineering > >>>University of Illinois at Urbana-Champaign > >> > >> > >>>(217) 550-2360 > >>>salaza11 at illinois.edu > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>>-- > >>>What most experimenters take for granted before they begin their > >>>experiments is infinitely more interesting than any results to which > >>>their experiments lead. > >>>-- Norbert Wiener > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abhyshr at mcs.anl.gov Thu Sep 25 11:32:26 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Thu, 25 Sep 2014 16:32:26 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: The solver does not know anything about the boundary conditions. You would have to specify it to the solver by describing the appropriate equations. For e.g. in the power grid example, there is a part in the residual evaluation if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { farr[offset] = 0.0; farr[offset+1] = 0.0; break; } This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS to 0.0. You can do something similar. Shri From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 10:52:16 -0500 To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements Thanks. Once I have marked the nodes that are fixed nodes using the component data structure, how can I process it later? I mean, at what point does the solver know that those degrees of freedom are actually fixed and how I can tell it that they are fixed? Miguel On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. > wrote: >Thanks. I think the term "Component" was confusing me, I thought it was >related to the components of a field. I think this would be useful to me >if I wanted to assign coordinates to the vertices, wouldn't it? Yes. You can put whatever data you want in the component data structure. > >Also, I was wondering how to set up dirichlet boundary conditions, >basically fixing certain nodes position. > > > You can add a component at each node with a field marking whether the node is a boundary node. >Could I do it as the function SetInitialValues does it in the pflow >example? > No. You need to put in the component data structure before calling DMNetworkAddComponent() >These values are used to eliminate the zeroth-order energy modes of the >stiffness matrix? > > >Last question, in my case I have two degrees of freedom per node, when I >grab the offset with DMNetworkVariableOffset, that's for the first degree >of freedom in that node and the second degree of freedom would just be >offset+1? > Yes. Shri > >Miguel > > >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. >> wrote: > >If you have equations only at the nodes, with a part of it contributed by >the edges (springs), then you can use DMNetwork. If you are planning to >have equations for the beads in the future, or other higher layers, then >DMPlex has better functionality > to manage that. > >Shri > > >From: Miguel Angel Salazar de Troya > >Date: Wed, 24 Sep 2014 17:38:11 -0500 >To: Shri > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >Thanks for your response. I'm attaching a pdf with a description of the >model. The description of the PetscSection is necessary for the >DMNetwork? It looks like DMNetwork does not use a PetscSection. > > > > > > >Miguel > > >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. >> wrote: > > >>Thanks for your response. My discretization is based on spring elements. >>For the linear one dimensional case in which each spring has a >>coefficient k, their jacobian would be this two by two matrix. >>[ k -k ] >>[ -k k ] >> >>and the internal force >> >>[ k ( Ui - Uj) ] >>[ k ( Uj - Ui) ] >> >>where Ui and Uj are the node displacements (just one displacement per >>node because it's one dimensional) >> >>For the two dimensional case, assuming small deformations, we have a >>four-by-four matrix. 
Each node has two degrees of freedom. We obtain it >>by performing the outer product of the vector (t , -t) where "t" is the >>vector that connects both nodes in a spring. This is for the case of >>small deformations. I would need to assemble each spring contribution to >>the jacobian and the residual like they were finite elements. The springs >>share nodes, that's how they are connected. This example is just the >>linear case, I will have to implement a nonlinear case in a similar >>fashion. >> >>Seeing the DMNetwork example, I think it's what I need, although I don't >>know much of power electric grids and it's hard for me to understand >>what's going on. Do you have a good reference to be able to follow the >>code? > >> >Please see the attached document which has more description of DMNetwork >and the equations for the power grid example. I don't have anything that >describes how the power grid example is implemented. > >>For example, why are they adding components to the edges? >> >>475: DMNetworkAddComponent >>>D >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey >>[ >>0],&pfdata.branch[i-eStart]);Miguel > >Each edge or node can have several components (limited to 10) attached to >it. The term components, taken from the circuit terminology, refers to the >elements of a network. For example, a component could be a resistor, >inductor, spring, or even edge/vertex weights (for graph problems). For >code implementation, component is a data structure that holds the data >needed for the residual, Jacobian, or any other function evaluation. In >the case of power grid, there are 4 components: branches or transmission >lines connecting nodes, buses or nodes, generators that are incident at a >subset of the nodes, and loads that are also incident at a subset of the >nodes. Each of the these components are defined by their data structures >given in pf.h. > >DMNetwork is a wrapper class of DMPlex specifically for network >applications that can be solely described using nodes, edges, and their >associated components. If you have a PDE, or need FEM, or need other >advanced features then DMPlex would be suitable. Please send us a write-up >of your equations so that we can assist you better. > >Shri > > >> >> >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >>> wrote: >> >>You may also want to take a look at the DMNetwork framework that can be >>used for general unstructured networks that don't use PDEs. Its >>description is given in the manual and an example is in >>src/snes/examples/tutorials/network/pflow. >> >>Shri >> >>From: Matthew Knepley > >>Date: Tue, 23 Sep 2014 22:40:52 -0400 >>To: Miguel Angel Salazar de Troya > >>Cc: "petsc-users at mcs.anl.gov" > >>Subject: Re: [petsc-users] DMPlex with spring elements >> >> >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>>> wrote: >>> >>>Hi all >>>I was wondering if it could be possible to build a model similar to the >>>example snes/ex12.c, but with spring elements (for elasticity) instead >>>of >>>simplicial elements. Spring elements in a grid, therefore each element >>>would have two nodes and each node two components. There would be more >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >>>g3 to build the residual and the jacobian, I would call a routine that >>>would build the residual vector and the jacobian matrix directly. I >>>would >>>not have shape functions whatsoever. My problem is discrete, I don't >>>have >>>a PDE and my equations are algebraic. 
What is the best way in petsc to >>>solve this problem? Is there any example that I can follow? Thanks in >>>advance >>> >>> >>> >>> >>>Yes, ex12 is fairly specific to FEM. However, I think the right tools >>>for >>>what you want are >>>DMPlex and PetscSection. Here is how I would proceed: >>> >>> 1) Make a DMPlex that encodes a simple network that you wish to >>>simulate >>> >>> 2) Make a PetscSection that gets the data layout right. Its hard from >>>the above >>> for me to understand where you degrees of freedom actually are. >>>This is usually >>> the hard part. >>> >>> 3) Calculate the residual, so you can check an exact solution. Here >>>you >>>use the >>> PetscSectionGetDof/Offset() for each mesh piece that you are >>>interested in. Again, >>> its hard to be more specific when I do not understand your >>>discretization. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>Miguel >>> >>> >>> >>>-- >>>Miguel Angel Salazar de Troya >>>Graduate Research Assistant >>>Department of Mechanical Science and Engineering >>>University of Illinois at Urbana-Champaign >> >> >>>(217) 550-2360 >>>salaza11 at illinois.edu >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>-- >>>What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu -- Miguel Angel Salazar de Troya Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Thu Sep 25 11:43:13 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 25 Sep 2014 11:43:13 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: I see, and I guess I would have to assign a value of one to the diagonal entry of that degree of freedom in the Jacobian right? Wouldn't this break the symmetry of the Jacobian (in case it were symmetric)? Thanks Miguel On Sep 25, 2014 11:32 AM, "Abhyankar, Shrirang G." wrote: > The solver does not know anything about the boundary conditions. You > would have to specify it to the solver by describing the appropriate > equations. For e.g. in the power grid example, there is a part in the > residual evaluation > > if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { > farr[offset] = 0.0; > farr[offset+1] = 0.0; > break; > } > > This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS > to 0.0. You can do something similar. > > Shri > > > > From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 10:52:16 -0500 > To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > Thanks. 
Once I have marked the nodes that are fixed nodes using the > component data structure, how can I process it later? I mean, at what point > does the solver know that those degrees of freedom are actually fixed and > how I can tell it that they are fixed? > > Miguel > > On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov> wrote: > >> >> >> >Thanks. I think the term "Component" was confusing me, I thought it was >> >related to the components of a field. I think this would be useful to me >> >if I wanted to assign coordinates to the vertices, wouldn't it? >> >> Yes. You can put whatever data you want in the component data structure. >> >> > >> >Also, I was wondering how to set up dirichlet boundary conditions, >> >basically fixing certain nodes position. >> > >> >> > >> > >> You can add a component at each node with a field marking whether the node >> is a boundary node. >> >> >Could I do it as the function SetInitialValues does it in the pflow >> >example? >> > >> >> No. You need to put in the component data structure before calling >> DMNetworkAddComponent() >> >> >> >These values are used to eliminate the zeroth-order energy modes of the >> >stiffness matrix? >> > >> >> >> > >> >Last question, in my case I have two degrees of freedom per node, when I >> >grab the offset with DMNetworkVariableOffset, that's for the first degree >> >of freedom in that node and the second degree of freedom would just be >> >offset+1? >> > >> >> Yes. >> >> Shri >> >> > >> >Miguel >> > >> > >> >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. >> > wrote: >> > >> >If you have equations only at the nodes, with a part of it contributed by >> >the edges (springs), then you can use DMNetwork. If you are planning to >> >have equations for the beads in the future, or other higher layers, then >> >DMPlex has better functionality >> > to manage that. >> > >> >Shri >> > >> > >> >From: Miguel Angel Salazar de Troya >> >Date: Wed, 24 Sep 2014 17:38:11 -0500 >> >To: Shri >> >Cc: "petsc-users at mcs.anl.gov" >> >Subject: Re: [petsc-users] DMPlex with spring elements >> > >> > >> > >> > >> > >> >Thanks for your response. I'm attaching a pdf with a description of the >> >model. The description of the PetscSection is necessary for the >> >DMNetwork? It looks like DMNetwork does not use a PetscSection. >> > >> > >> > >> > >> > >> > >> >Miguel >> > >> > >> >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. >> > wrote: >> > >> > >> >>Thanks for your response. My discretization is based on spring elements. >> >>For the linear one dimensional case in which each spring has a >> >>coefficient k, their jacobian would be this two by two matrix. >> >>[ k -k ] >> >>[ -k k ] >> >> >> >>and the internal force >> >> >> >>[ k ( Ui - Uj) ] >> >>[ k ( Uj - Ui) ] >> >> >> >>where Ui and Uj are the node displacements (just one displacement per >> >>node because it's one dimensional) >> >> >> >>For the two dimensional case, assuming small deformations, we have a >> >>four-by-four matrix. Each node has two degrees of freedom. We obtain it >> >>by performing the outer product of the vector (t , -t) where "t" is the >> >>vector that connects both nodes in a spring. This is for the case of >> >>small deformations. I would need to assemble each spring contribution to >> >>the jacobian and the residual like they were finite elements. The >> springs >> >>share nodes, that's how they are connected. This example is just the >> >>linear case, I will have to implement a nonlinear case in a similar >> >>fashion. 
>> >> >> >>Seeing the DMNetwork example, I think it's what I need, although I don't >> >>know much of power electric grids and it's hard for me to understand >> >>what's going on. Do you have a good reference to be able to follow the >> >>code? >> > >> >> >> >Please see the attached document which has more description of DMNetwork >> >and the equations for the power grid example. I don't have anything that >> >describes how the power grid example is implemented. >> > >> >>For example, why are they adding components to the edges? >> >> >> >>475: DMNetworkAddComponent >> >>< >> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/ >> >>D >> >> >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey >> >>[ >> >>0],&pfdata.branch[i-eStart]);Miguel >> > >> >Each edge or node can have several components (limited to 10) attached to >> >it. The term components, taken from the circuit terminology, refers to >> the >> >elements of a network. For example, a component could be a resistor, >> >inductor, spring, or even edge/vertex weights (for graph problems). For >> >code implementation, component is a data structure that holds the data >> >needed for the residual, Jacobian, or any other function evaluation. In >> >the case of power grid, there are 4 components: branches or transmission >> >lines connecting nodes, buses or nodes, generators that are incident at a >> >subset of the nodes, and loads that are also incident at a subset of the >> >nodes. Each of the these components are defined by their data structures >> >given in pf.h. >> > >> >DMNetwork is a wrapper class of DMPlex specifically for network >> >applications that can be solely described using nodes, edges, and their >> >associated components. If you have a PDE, or need FEM, or need other >> >advanced features then DMPlex would be suitable. Please send us a >> write-up >> >of your equations so that we can assist you better. >> > >> >Shri >> > >> > >> >> >> >> >> >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >> >> wrote: >> >> >> >>You may also want to take a look at the DMNetwork framework that can be >> >>used for general unstructured networks that don't use PDEs. Its >> >>description is given in the manual and an example is in >> >>src/snes/examples/tutorials/network/pflow. >> >> >> >>Shri >> >> >> >>From: Matthew Knepley >> >>Date: Tue, 23 Sep 2014 22:40:52 -0400 >> >>To: Miguel Angel Salazar de Troya >> >>Cc: "petsc-users at mcs.anl.gov" >> >>Subject: Re: [petsc-users] DMPlex with spring elements >> >> >> >> >> >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >> >>> wrote: >> >>> >> >>>Hi all >> >>>I was wondering if it could be possible to build a model similar to the >> >>>example snes/ex12.c, but with spring elements (for elasticity) instead >> >>>of >> >>>simplicial elements. Spring elements in a grid, therefore each element >> >>>would have two nodes and each node two components. There would be more >> >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 >> and >> >>>g3 to build the residual and the jacobian, I would call a routine that >> >>>would build the residual vector and the jacobian matrix directly. I >> >>>would >> >>>not have shape functions whatsoever. My problem is discrete, I don't >> >>>have >> >>>a PDE and my equations are algebraic. What is the best way in petsc to >> >>>solve this problem? Is there any example that I can follow? Thanks in >> >>>advance >> >>> >> >>> >> >>> >> >>> >> >>>Yes, ex12 is fairly specific to FEM. 
However, I think the right tools >> >>>for >> >>>what you want are >> >>>DMPlex and PetscSection. Here is how I would proceed: >> >>> >> >>> 1) Make a DMPlex that encodes a simple network that you wish to >> >>>simulate >> >>> >> >>> 2) Make a PetscSection that gets the data layout right. Its hard from >> >>>the above >> >>> for me to understand where you degrees of freedom actually are. >> >>>This is usually >> >>> the hard part. >> >>> >> >>> 3) Calculate the residual, so you can check an exact solution. Here >> >>>you >> >>>use the >> >>> PetscSectionGetDof/Offset() for each mesh piece that you are >> >>>interested in. Again, >> >>> its hard to be more specific when I do not understand your >> >>>discretization. >> >>> >> >>> Thanks, >> >>> >> >>> Matt >> >>> >> >>> >> >>>Miguel >> >>> >> >>> >> >>> >> >>>-- >> >>>Miguel Angel Salazar de Troya >> >>>Graduate Research Assistant >> >>>Department of Mechanical Science and Engineering >> >>>University of Illinois at Urbana-Champaign >> >> >> >> >> >>>(217) 550-2360 >> >>>salaza11 at illinois.edu >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>>-- >> >>>What most experimenters take for granted before they begin their >> >>>experiments is infinitely more interesting than any results to which >> >>>their experiments lead. >> >>>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>-- >> >>Miguel Angel Salazar de Troya >> >>Graduate Research Assistant >> >>Department of Mechanical Science and Engineering >> >>University of Illinois at Urbana-Champaign >> >>(217) 550-2360 >> >>salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> >> > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Thu Sep 25 12:38:36 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Thu, 25 Sep 2014 17:38:36 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 11:43:13 -0500 To: Shri > Cc: > Subject: Re: [petsc-users] DMPlex with spring elements I see, and I guess I would have to assign a value of one to the diagonal entry of that degree of freedom in the Jacobian right? Yes. Wouldn't this break the symmetry of the Jacobian (in case it were symmetric)? No. Shri Thanks Miguel On Sep 25, 2014 11:32 AM, "Abhyankar, Shrirang G." > wrote: The solver does not know anything about the boundary conditions. You would have to specify it to the solver by describing the appropriate equations. For e.g. in the power grid example, there is a part in the residual evaluation if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { farr[offset] = 0.0; farr[offset+1] = 0.0; break; } This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS to 0.0. 
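As a side note on the fixed-node exchange above, a sketch of the matrix side of the same idea: once the Jacobian J is assembled, PETSc can zero the constrained rows and set the diagonal in a single call. MatZeroRowsColumns() additionally clears the matching columns, which is the variant that keeps a symmetric matrix symmetric; nfixed and fixedrows are placeholders for the global indices of the fixed dofs.

ierr = MatZeroRowsColumns(J, nfixed, fixedrows, 1.0, NULL, NULL);CHKERRQ(ierr);  /* rows and columns */
/* or, if symmetry is not a concern: */
ierr = MatZeroRows(J, nfixed, fixedrows, 1.0, NULL, NULL);CHKERRQ(ierr);         /* rows only */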
You can do something similar. Shri From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 10:52:16 -0500 To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements Thanks. Once I have marked the nodes that are fixed nodes using the component data structure, how can I process it later? I mean, at what point does the solver know that those degrees of freedom are actually fixed and how I can tell it that they are fixed? Miguel On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. > wrote: >Thanks. I think the term "Component" was confusing me, I thought it was >related to the components of a field. I think this would be useful to me >if I wanted to assign coordinates to the vertices, wouldn't it? Yes. You can put whatever data you want in the component data structure. > >Also, I was wondering how to set up dirichlet boundary conditions, >basically fixing certain nodes position. > > > You can add a component at each node with a field marking whether the node is a boundary node. >Could I do it as the function SetInitialValues does it in the pflow >example? > No. You need to put in the component data structure before calling DMNetworkAddComponent() >These values are used to eliminate the zeroth-order energy modes of the >stiffness matrix? > > >Last question, in my case I have two degrees of freedom per node, when I >grab the offset with DMNetworkVariableOffset, that's for the first degree >of freedom in that node and the second degree of freedom would just be >offset+1? > Yes. Shri > >Miguel > > >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. >> wrote: > >If you have equations only at the nodes, with a part of it contributed by >the edges (springs), then you can use DMNetwork. If you are planning to >have equations for the beads in the future, or other higher layers, then >DMPlex has better functionality > to manage that. > >Shri > > >From: Miguel Angel Salazar de Troya > >Date: Wed, 24 Sep 2014 17:38:11 -0500 >To: Shri > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >Thanks for your response. I'm attaching a pdf with a description of the >model. The description of the PetscSection is necessary for the >DMNetwork? It looks like DMNetwork does not use a PetscSection. > > > > > > >Miguel > > >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. >> wrote: > > >>Thanks for your response. My discretization is based on spring elements. >>For the linear one dimensional case in which each spring has a >>coefficient k, their jacobian would be this two by two matrix. >>[ k -k ] >>[ -k k ] >> >>and the internal force >> >>[ k ( Ui - Uj) ] >>[ k ( Uj - Ui) ] >> >>where Ui and Uj are the node displacements (just one displacement per >>node because it's one dimensional) >> >>For the two dimensional case, assuming small deformations, we have a >>four-by-four matrix. Each node has two degrees of freedom. We obtain it >>by performing the outer product of the vector (t , -t) where "t" is the >>vector that connects both nodes in a spring. This is for the case of >>small deformations. I would need to assemble each spring contribution to >>the jacobian and the residual like they were finite elements. The springs >>share nodes, that's how they are connected. This example is just the >>linear case, I will have to implement a nonlinear case in a similar >>fashion. 
>> >>Seeing the DMNetwork example, I think it's what I need, although I don't >>know much of power electric grids and it's hard for me to understand >>what's going on. Do you have a good reference to be able to follow the >>code? > >> >Please see the attached document which has more description of DMNetwork >and the equations for the power grid example. I don't have anything that >describes how the power grid example is implemented. > >>For example, why are they adding components to the edges? >> >>475: DMNetworkAddComponent >>>D >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey >>[ >>0],&pfdata.branch[i-eStart]);Miguel > >Each edge or node can have several components (limited to 10) attached to >it. The term components, taken from the circuit terminology, refers to the >elements of a network. For example, a component could be a resistor, >inductor, spring, or even edge/vertex weights (for graph problems). For >code implementation, component is a data structure that holds the data >needed for the residual, Jacobian, or any other function evaluation. In >the case of power grid, there are 4 components: branches or transmission >lines connecting nodes, buses or nodes, generators that are incident at a >subset of the nodes, and loads that are also incident at a subset of the >nodes. Each of the these components are defined by their data structures >given in pf.h. > >DMNetwork is a wrapper class of DMPlex specifically for network >applications that can be solely described using nodes, edges, and their >associated components. If you have a PDE, or need FEM, or need other >advanced features then DMPlex would be suitable. Please send us a write-up >of your equations so that we can assist you better. > >Shri > > >> >> >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >>> wrote: >> >>You may also want to take a look at the DMNetwork framework that can be >>used for general unstructured networks that don't use PDEs. Its >>description is given in the manual and an example is in >>src/snes/examples/tutorials/network/pflow. >> >>Shri >> >>From: Matthew Knepley > >>Date: Tue, 23 Sep 2014 22:40:52 -0400 >>To: Miguel Angel Salazar de Troya > >>Cc: "petsc-users at mcs.anl.gov" > >>Subject: Re: [petsc-users] DMPlex with spring elements >> >> >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>>> wrote: >>> >>>Hi all >>>I was wondering if it could be possible to build a model similar to the >>>example snes/ex12.c, but with spring elements (for elasticity) instead >>>of >>>simplicial elements. Spring elements in a grid, therefore each element >>>would have two nodes and each node two components. There would be more >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 and >>>g3 to build the residual and the jacobian, I would call a routine that >>>would build the residual vector and the jacobian matrix directly. I >>>would >>>not have shape functions whatsoever. My problem is discrete, I don't >>>have >>>a PDE and my equations are algebraic. What is the best way in petsc to >>>solve this problem? Is there any example that I can follow? Thanks in >>>advance >>> >>> >>> >>> >>>Yes, ex12 is fairly specific to FEM. However, I think the right tools >>>for >>>what you want are >>>DMPlex and PetscSection. Here is how I would proceed: >>> >>> 1) Make a DMPlex that encodes a simple network that you wish to >>>simulate >>> >>> 2) Make a PetscSection that gets the data layout right. 
Its hard from >>>the above >>> for me to understand where you degrees of freedom actually are. >>>This is usually >>> the hard part. >>> >>> 3) Calculate the residual, so you can check an exact solution. Here >>>you >>>use the >>> PetscSectionGetDof/Offset() for each mesh piece that you are >>>interested in. Again, >>> its hard to be more specific when I do not understand your >>>discretization. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>Miguel >>> >>> >>> >>>-- >>>Miguel Angel Salazar de Troya >>>Graduate Research Assistant >>>Department of Mechanical Science and Engineering >>>University of Illinois at Urbana-Champaign >> >> >>>(217) 550-2360 >>>salaza11 at illinois.edu >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>-- >>>What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu -- Miguel Angel Salazar de Troya Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Thu Sep 25 12:57:53 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 25 Sep 2014 12:57:53 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Why not? Wouldn't we have a row of zeros except for the diagonal term? The column that corresponds to that degree of from doesn't have to be zero, right? Thanks Miguel On Sep 25, 2014 12:38 PM, "Abhyankar, Shrirang G." wrote: > > > From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 11:43:13 -0500 > To: Shri > Cc: > Subject: Re: [petsc-users] DMPlex with spring elements > > I see, and I guess I would have to assign a value of one to the diagonal > entry of that degree of freedom in the Jacobian right? > > Yes. > > Wouldn't this break the symmetry of the Jacobian (in case it were > symmetric)? > > No. > > Shri > > Thanks > Miguel > On Sep 25, 2014 11:32 AM, "Abhyankar, Shrirang G." > wrote: > >> The solver does not know anything about the boundary conditions. You >> would have to specify it to the solver by describing the appropriate >> equations. For e.g. in the power grid example, there is a part in the >> residual evaluation >> >> if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { >> farr[offset] = 0.0; >> farr[offset+1] = 0.0; >> break; >> } >> >> This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS >> to 0.0. You can do something similar. >> >> Shri >> >> >> >> From: Miguel Angel Salazar de Troya >> Date: Thu, 25 Sep 2014 10:52:16 -0500 >> To: Shri >> Cc: "petsc-users at mcs.anl.gov" >> Subject: Re: [petsc-users] DMPlex with spring elements >> >> Thanks. 
Once I have marked the nodes that are fixed nodes using the >> component data structure, how can I process it later? I mean, at what point >> does the solver know that those degrees of freedom are actually fixed and >> how I can tell it that they are fixed? >> >> Miguel >> >> On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. < >> abhyshr at mcs.anl.gov> wrote: >> >>> >>> >>> >Thanks. I think the term "Component" was confusing me, I thought it was >>> >related to the components of a field. I think this would be useful to me >>> >if I wanted to assign coordinates to the vertices, wouldn't it? >>> >>> Yes. You can put whatever data you want in the component data structure. >>> >>> > >>> >Also, I was wondering how to set up dirichlet boundary conditions, >>> >basically fixing certain nodes position. >>> > >>> >>> > >>> > >>> You can add a component at each node with a field marking whether the >>> node >>> is a boundary node. >>> >>> >Could I do it as the function SetInitialValues does it in the pflow >>> >example? >>> > >>> >>> No. You need to put in the component data structure before calling >>> DMNetworkAddComponent() >>> >>> >>> >These values are used to eliminate the zeroth-order energy modes of the >>> >stiffness matrix? >>> > >>> >>> >>> > >>> >Last question, in my case I have two degrees of freedom per node, when I >>> >grab the offset with DMNetworkVariableOffset, that's for the first >>> degree >>> >of freedom in that node and the second degree of freedom would just be >>> >offset+1? >>> > >>> >>> Yes. >>> >>> Shri >>> >>> > >>> >Miguel >>> > >>> > >>> >On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. >>> > wrote: >>> > >>> >If you have equations only at the nodes, with a part of it contributed >>> by >>> >the edges (springs), then you can use DMNetwork. If you are planning to >>> >have equations for the beads in the future, or other higher layers, then >>> >DMPlex has better functionality >>> > to manage that. >>> > >>> >Shri >>> > >>> > >>> >From: Miguel Angel Salazar de Troya >>> >Date: Wed, 24 Sep 2014 17:38:11 -0500 >>> >To: Shri >>> >Cc: "petsc-users at mcs.anl.gov" >>> >Subject: Re: [petsc-users] DMPlex with spring elements >>> > >>> > >>> > >>> > >>> > >>> >Thanks for your response. I'm attaching a pdf with a description of the >>> >model. The description of the PetscSection is necessary for the >>> >DMNetwork? It looks like DMNetwork does not use a PetscSection. >>> > >>> > >>> > >>> > >>> > >>> > >>> >Miguel >>> > >>> > >>> >On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. >>> > wrote: >>> > >>> > >>> >>Thanks for your response. My discretization is based on spring >>> elements. >>> >>For the linear one dimensional case in which each spring has a >>> >>coefficient k, their jacobian would be this two by two matrix. >>> >>[ k -k ] >>> >>[ -k k ] >>> >> >>> >>and the internal force >>> >> >>> >>[ k ( Ui - Uj) ] >>> >>[ k ( Uj - Ui) ] >>> >> >>> >>where Ui and Uj are the node displacements (just one displacement per >>> >>node because it's one dimensional) >>> >> >>> >>For the two dimensional case, assuming small deformations, we have a >>> >>four-by-four matrix. Each node has two degrees of freedom. We obtain it >>> >>by performing the outer product of the vector (t , -t) where "t" is the >>> >>vector that connects both nodes in a spring. This is for the case of >>> >>small deformations. I would need to assemble each spring contribution >>> to >>> >>the jacobian and the residual like they were finite elements. 
The >>> springs >>> >>share nodes, that's how they are connected. This example is just the >>> >>linear case, I will have to implement a nonlinear case in a similar >>> >>fashion. >>> >> >>> >>Seeing the DMNetwork example, I think it's what I need, although I >>> don't >>> >>know much of power electric grids and it's hard for me to understand >>> >>what's going on. Do you have a good reference to be able to follow the >>> >>code? >>> > >>> >> >>> >Please see the attached document which has more description of DMNetwork >>> >and the equations for the power grid example. I don't have anything that >>> >describes how the power grid example is implemented. >>> > >>> >>For example, why are they adding components to the edges? >>> >> >>> >>475: DMNetworkAddComponent >>> >>< >>> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM/ >>> >>D >>> >>> >>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentkey >>> >>[ >>> >>0],&pfdata.branch[i-eStart]);Miguel >>> > >>> >Each edge or node can have several components (limited to 10) attached >>> to >>> >it. The term components, taken from the circuit terminology, refers to >>> the >>> >elements of a network. For example, a component could be a resistor, >>> >inductor, spring, or even edge/vertex weights (for graph problems). For >>> >code implementation, component is a data structure that holds the data >>> >needed for the residual, Jacobian, or any other function evaluation. In >>> >the case of power grid, there are 4 components: branches or transmission >>> >lines connecting nodes, buses or nodes, generators that are incident at >>> a >>> >subset of the nodes, and loads that are also incident at a subset of the >>> >nodes. Each of the these components are defined by their data structures >>> >given in pf.h. >>> > >>> >DMNetwork is a wrapper class of DMPlex specifically for network >>> >applications that can be solely described using nodes, edges, and their >>> >associated components. If you have a PDE, or need FEM, or need other >>> >advanced features then DMPlex would be suitable. Please send us a >>> write-up >>> >of your equations so that we can assist you better. >>> > >>> >Shri >>> > >>> > >>> >> >>> >> >>> >>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >>> >> wrote: >>> >> >>> >>You may also want to take a look at the DMNetwork framework that can be >>> >>used for general unstructured networks that don't use PDEs. Its >>> >>description is given in the manual and an example is in >>> >>src/snes/examples/tutorials/network/pflow. >>> >> >>> >>Shri >>> >> >>> >>From: Matthew Knepley >>> >>Date: Tue, 23 Sep 2014 22:40:52 -0400 >>> >>To: Miguel Angel Salazar de Troya >>> >>Cc: "petsc-users at mcs.anl.gov" >>> >>Subject: Re: [petsc-users] DMPlex with spring elements >>> >> >>> >> >>> >>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>> >>> wrote: >>> >>> >>> >>>Hi all >>> >>>I was wondering if it could be possible to build a model similar to >>> the >>> >>>example snes/ex12.c, but with spring elements (for elasticity) instead >>> >>>of >>> >>>simplicial elements. Spring elements in a grid, therefore each element >>> >>>would have two nodes and each node two components. There would be more >>> >>>differences, because instead of calling the functions f0,f1,g0,g1,g2 >>> and >>> >>>g3 to build the residual and the jacobian, I would call a routine that >>> >>>would build the residual vector and the jacobian matrix directly. I >>> >>>would >>> >>>not have shape functions whatsoever. 
My problem is discrete, I don't >>> >>>have >>> >>>a PDE and my equations are algebraic. What is the best way in petsc to >>> >>>solve this problem? Is there any example that I can follow? Thanks in >>> >>>advance >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>Yes, ex12 is fairly specific to FEM. However, I think the right tools >>> >>>for >>> >>>what you want are >>> >>>DMPlex and PetscSection. Here is how I would proceed: >>> >>> >>> >>> 1) Make a DMPlex that encodes a simple network that you wish to >>> >>>simulate >>> >>> >>> >>> 2) Make a PetscSection that gets the data layout right. Its hard >>> from >>> >>>the above >>> >>> for me to understand where you degrees of freedom actually are. >>> >>>This is usually >>> >>> the hard part. >>> >>> >>> >>> 3) Calculate the residual, so you can check an exact solution. Here >>> >>>you >>> >>>use the >>> >>> PetscSectionGetDof/Offset() for each mesh piece that you are >>> >>>interested in. Again, >>> >>> its hard to be more specific when I do not understand your >>> >>>discretization. >>> >>> >>> >>> Thanks, >>> >>> >>> >>> Matt >>> >>> >>> >>> >>> >>>Miguel >>> >>> >>> >>> >>> >>> >>> >>>-- >>> >>>Miguel Angel Salazar de Troya >>> >>>Graduate Research Assistant >>> >>>Department of Mechanical Science and Engineering >>> >>>University of Illinois at Urbana-Champaign >>> >> >>> >> >>> >>>(217) 550-2360 >>> >>>salaza11 at illinois.edu >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>-- >>> >>>What most experimenters take for granted before they begin their >>> >>>experiments is infinitely more interesting than any results to which >>> >>>their experiments lead. >>> >>>-- Norbert Wiener >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >>-- >>> >>Miguel Angel Salazar de Troya >>> >>Graduate Research Assistant >>> >>Department of Mechanical Science and Engineering >>> >>University of Illinois at Urbana-Champaign >>> >>(217) 550-2360 >>> >>salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> >salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> >salaza11 at illinois.edu >>> >>> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 25 13:05:09 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 13:05:09 -0500 Subject: [petsc-users] problem with nested fieldsplits In-Reply-To: References: Message-ID: On Mon, Sep 22, 2014 at 9:15 PM, Jean-Arthur Louis Olive wrote: > Hi all, > I am using PETSc (dev version) to solve the Stokes + temperature > equations. My DM has fields (vx, vy, p, T). > I have finally had time to look at this. I have tried to reproduce this setup in a PETSc example. 
Here is SNES ex19: cd src/snes/examples/tutorials make ex19 ./ex19 -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_block_size 4 -pc_fieldsplit_type SCHUR -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 -fieldsplit_1_pc_type lu -fieldsplit_0_pc_type fieldsplit -fieldsplit_0_pc_fieldsplit_block_size 3 -fieldsplit_0_pc_fieldsplit_0_fields 0,1 -fieldsplit_0_pc_fieldsplit_1_fields 2 -fieldsplit_0_pc_fieldsplit_type schur -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_1_pc_type lu -snes_monitor_short -ksp_monitor_short -snes_view lid velocity = 0.0625, prandtl # = 1, grashof # = 1 0 SNES Function norm 0.239155 0 KSP Residual norm 0.239155 1 KSP Residual norm 8.25786e-07 1 SNES Function norm 6.82106e-05 0 KSP Residual norm 6.82106e-05 1 KSP Residual norm 1.478e-11 2 SNES Function norm 1.533e-11 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=2 total number of function evaluations=3 SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, blocksize = 4, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Fields 0, 1, 2 Split number 1 Fields 3 KSP solver for A00 block KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, blocksize = 3, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Fields 0, 1 Split number 1 Fields 2 KSP solver for A00 block KSP Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32, cols=32 package used to perform factorization: petsc total: nonzeros=480, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 
(fieldsplit_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=32, cols=32 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=16, cols=16 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=16, cols=32 total: nonzeros=128, allocated nonzeros=128 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32, cols=32 package used to perform factorization: petsc total: nonzeros=480, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=32, cols=32 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=32, cols=16 total: nonzeros=128, allocated nonzeros=128 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during 
MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_1_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_1_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_1_) 1 MPI processes type: schurcomplement rows=16, cols=16 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=16, cols=48 total: nonzeros=192, allocated nonzeros=192 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, blocksize = 3, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Fields 0, 1 Split number 1 Fields 2 KSP solver for A00 block KSP Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32, cols=32 package used to perform factorization: petsc total: nonzeros=480, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=32, cols=32 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=120, allocated nonzeros=120 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: schurcomplement rows=16, cols=16 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=16, cols=32 total: nonzeros=128, allocated nonzeros=128 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5, needed 1.875 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=32, cols=32 package used to perform factorization: petsc total: nonzeros=480, allocated nonzeros=480 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 12 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_0_fieldsplit_0_) 1 MPI processes type: seqaij rows=32, cols=32 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=32, cols=16 total: nonzeros=128, allocated nonzeros=128 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Mat Object: (fieldsplit_0_fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=48, cols=16 total: nonzeros=192, allocated nonzeros=192 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Mat Object: (fieldsplit_1_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 
total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=64, cols=64, bs=4 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Number of SNES iterations = 2 I cannot replicate your failure here. I am running on the 'next' branch here. What are you using? Also, are you using the DMDA for data layout? I would like to use nested fieldsplits to separate the T part from the > Stokes part, and apply a Schur complement approach to the Stokes block. > Unfortunately, I keep getting this error message: > [1]PETSC ERROR: DMCreateFieldDecomposition() line 1274 in > /home/jolive/petsc/src/dm/interface/dm.c Decomposition defined only after > DMSetUp > > Here are the command line options I tried: > > -snes_type ksponly \ > -ksp_type fgmres \ > # define 2 fields: [vx vy p] and [T] > -pc_type fieldsplit -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields > 3 \ > # split [vx vy p] into 2 fields: [vx vy] and [p] > -fieldsplit_0_pc_type fieldsplit \ > -pc_fieldsplit_0_fieldsplit_0_fields 0,1 -pc_fieldsplit_0_fieldsplit_1 > _fields 2 \ > Note that the 2 options above are wrong. It should be -fieldsplit_0_pc_fieldsplit_0_fields 0,1 Thanks, Matt > # apply schur complement to [vx vy p] > -fieldsplit_0_pc_fieldsplit_type schur \ > -fieldsplit_0_pc_fieldsplit_schur_factorization_type upper \ > > # solve everything with lu, just for testing > -fieldsplit_0_fieldsplit_0_ksp_type preonly \ > -fieldsplit_0_fieldsplit_0_pc_type lu -fieldsplit_0_fieldsplit_0_pc_factor_mat_solver_package > superlu_dist \ > -fieldsplit_0_fieldsplit_1_ksp_type preonly \ > -fieldsplit_0_fieldsplit_1_pc_type lu -fieldsplit_0_fieldsplit_1_pc_factor_mat_solver_package > superlu_dist \ > -fieldsplit_1_ksp_type preonly \ > -fieldsplit_1_pc_type lu -fieldsplit_1_pc_factor_mat_solver_package > superlu_dist \ > > Any idea what could be causing this? > Thanks a lot, > Arthur > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Thu Sep 25 13:46:48 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Thu, 25 Sep 2014 18:46:48 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: You are right. The Jacobian for the power grid application is indeed non-symmetric. Is that a problem for your application? Shri From: Miguel Angel Salazar de Troya Date: Thu, 25 Sep 2014 12:57:53 -0500 To: Shri Cc: Subject: Re: [petsc-users] DMPlex with spring elements >Why not? Wouldn't we have a row of zeros except for the diagonal term? >The column that corresponds to that degree of from doesn't have to be >zero, right? >Thanks >Miguel >On Sep 25, 2014 12:38 PM, "Abhyankar, Shrirang G." >wrote: > > > >From: Miguel Angel Salazar de Troya >Date: Thu, 25 Sep 2014 11:43:13 -0500 >To: Shri >Cc: >Subject: Re: [petsc-users] DMPlex with spring elements > > > >I see, and I guess I would have to assign a value of one to the diagonal >entry of that degree of freedom in the Jacobian right? > > > >Yes. > >Wouldn't this break the symmetry of the Jacobian (in case it were >symmetric)? > > >No. > >Shri > >Thanks >Miguel >On Sep 25, 2014 11:32 AM, "Abhyankar, Shrirang G." 
>wrote: > >The solver does not know anything about the boundary conditions. You >would have to specify it to the solver by describing the appropriate >equations. For e.g. in the power grid example, there is a part in the >residual evaluation > > if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { > farr[offset] = 0.0; > farr[offset+1] = 0.0; > break; > } > > >This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS >to 0.0. You can do something similar. > >Shri > > > >From: Miguel Angel Salazar de Troya >Date: Thu, 25 Sep 2014 10:52:16 -0500 >To: Shri >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] DMPlex with spring elements > > > >Thanks. Once I have marked the nodes that are fixed nodes using the >component data structure, how can I process it later? I mean, at what >point does the solver know that those degrees of freedom are actually >fixed and how I can tell it that they > are fixed? > >Miguel > > >On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. > wrote: > > > >>Thanks. I think the term "Component" was confusing me, I thought it was >>related to the components of a field. I think this would be useful to me >>if I wanted to assign coordinates to the vertices, wouldn't it? > >Yes. You can put whatever data you want in the component data structure. > >> >>Also, I was wondering how to set up dirichlet boundary conditions, >>basically fixing certain nodes position. >> > >> >> >You can add a component at each node with a field marking whether the node >is a boundary node. > >>Could I do it as the function SetInitialValues does it in the pflow >>example? >> > >No. You need to put in the component data structure before calling >DMNetworkAddComponent() > > >>These values are used to eliminate the zeroth-order energy modes of the >>stiffness matrix? >> > > >> >>Last question, in my case I have two degrees of freedom per node, when I >>grab the offset with DMNetworkVariableOffset, that's for the first degree >>of freedom in that node and the second degree of freedom would just be >>offset+1? >> > >Yes. > >Shri > >> >>Miguel >> >> >>On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. >> wrote: >> >>If you have equations only at the nodes, with a part of it contributed by >>the edges (springs), then you can use DMNetwork. If you are planning to >>have equations for the beads in the future, or other higher layers, then >>DMPlex has better functionality >> to manage that. >> >>Shri >> >> >>From: Miguel Angel Salazar de Troya >>Date: Wed, 24 Sep 2014 17:38:11 -0500 >>To: Shri >>Cc: "petsc-users at mcs.anl.gov" >>Subject: Re: [petsc-users] DMPlex with spring elements >> >> >> >> >> >>Thanks for your response. I'm attaching a pdf with a description of the >>model. The description of the PetscSection is necessary for the >>DMNetwork? It looks like DMNetwork does not use a PetscSection. >> >> >> >> >> >> >>Miguel >> >> >>On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. >> wrote: >> >> >>>Thanks for your response. My discretization is based on spring elements. >>>For the linear one dimensional case in which each spring has a >>>coefficient k, their jacobian would be this two by two matrix. >>>[ k -k ] >>>[ -k k ] >>> >>>and the internal force >>> >>>[ k ( Ui - Uj) ] >>>[ k ( Uj - Ui) ] >>> >>>where Ui and Uj are the node displacements (just one displacement per >>>node because it's one dimensional) >>> >>>For the two dimensional case, assuming small deformations, we have a >>>four-by-four matrix. Each node has two degrees of freedom. 
We obtain it >>>by performing the outer product of the vector (t , -t) where "t" is the >>>vector that connects both nodes in a spring. This is for the case of >>>small deformations. I would need to assemble each spring contribution to >>>the jacobian and the residual like they were finite elements. The >>>springs >>>share nodes, that's how they are connected. This example is just the >>>linear case, I will have to implement a nonlinear case in a similar >>>fashion. >>> >>>Seeing the DMNetwork example, I think it's what I need, although I don't >>>know much of power electric grids and it's hard for me to understand >>>what's going on. Do you have a good reference to be able to follow the >>>code? >> >>> >>Please see the attached document which has more description of DMNetwork >>and the equations for the power grid example. I don't have anything that >>describes how the power grid example is implemented. >> >>>For example, why are they adding components to the edges? >>> >>>475: DMNetworkAddComponent >>>>>/ >>>D >>>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentke >>>y >>>[ >>>0],&pfdata.branch[i-eStart]);Miguel >> >>Each edge or node can have several components (limited to 10) attached to >>it. The term components, taken from the circuit terminology, refers to >>the >>elements of a network. For example, a component could be a resistor, >>inductor, spring, or even edge/vertex weights (for graph problems). For >>code implementation, component is a data structure that holds the data >>needed for the residual, Jacobian, or any other function evaluation. In >>the case of power grid, there are 4 components: branches or transmission >>lines connecting nodes, buses or nodes, generators that are incident at a >>subset of the nodes, and loads that are also incident at a subset of the >>nodes. Each of the these components are defined by their data structures >>given in pf.h. >> >>DMNetwork is a wrapper class of DMPlex specifically for network >>applications that can be solely described using nodes, edges, and their >>associated components. If you have a PDE, or need FEM, or need other >>advanced features then DMPlex would be suitable. Please send us a >>write-up >>of your equations so that we can assist you better. >> >>Shri >> >> >>> >>> >>>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. >>> wrote: >>> >>>You may also want to take a look at the DMNetwork framework that can be >>>used for general unstructured networks that don't use PDEs. Its >>>description is given in the manual and an example is in >>>src/snes/examples/tutorials/network/pflow. >>> >>>Shri >>> >>>From: Matthew Knepley >>>Date: Tue, 23 Sep 2014 22:40:52 -0400 >>>To: Miguel Angel Salazar de Troya >>>Cc: "petsc-users at mcs.anl.gov" >>>Subject: Re: [petsc-users] DMPlex with spring elements >>> >>> >>>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya >>>> wrote: >>>> >>>>Hi all >>>>I was wondering if it could be possible to build a model similar to the >>>>example snes/ex12.c, but with spring elements (for elasticity) instead >>>>of >>>>simplicial elements. Spring elements in a grid, therefore each element >>>>would have two nodes and each node two components. There would be more >>>>differences, because instead of calling the functions f0,f1,g0,g1,g2 >>>>and >>>>g3 to build the residual and the jacobian, I would call a routine that >>>>would build the residual vector and the jacobian matrix directly. I >>>>would >>>>not have shape functions whatsoever. 
My problem is discrete, I don't >>>>have >>>>a PDE and my equations are algebraic. What is the best way in petsc to >>>>solve this problem? Is there any example that I can follow? Thanks in >>>>advance >>>> >>>> >>>> >>>> >>>>Yes, ex12 is fairly specific to FEM. However, I think the right tools >>>>for >>>>what you want are >>>>DMPlex and PetscSection. Here is how I would proceed: >>>> >>>> 1) Make a DMPlex that encodes a simple network that you wish to >>>>simulate >>>> >>>> 2) Make a PetscSection that gets the data layout right. Its hard from >>>>the above >>>> for me to understand where you degrees of freedom actually are. >>>>This is usually >>>> the hard part. >>>> >>>> 3) Calculate the residual, so you can check an exact solution. Here >>>>you >>>>use the >>>> PetscSectionGetDof/Offset() for each mesh piece that you are >>>>interested in. Again, >>>> its hard to be more specific when I do not understand your >>>>discretization. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>Miguel >>>> >>>> >>>> >>>>-- >>>>Miguel Angel Salazar de Troya >>>>Graduate Research Assistant >>>>Department of Mechanical Science and Engineering >>>>University of Illinois at Urbana-Champaign >>> >>> >>>>(217) 550-2360 >>>>salaza11 at illinois.edu >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>-- >>>>What most experimenters take for granted before they begin their >>>>experiments is infinitely more interesting than any results to which >>>>their experiments lead. >>>>-- Norbert Wiener >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>-- >>>Miguel Angel Salazar de Troya >>>Graduate Research Assistant >>>Department of Mechanical Science and Engineering >>>University of Illinois at Urbana-Champaign >>>(217) 550-2360 >>>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu From knepley at gmail.com Thu Sep 25 13:49:42 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 13:49:42 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. wrote: > You are right. The Jacobian for the power grid application is indeed > non-symmetric. Is that a problem for your application? > If you need a symmetric Jacobian, you can use the BC facility in PetscSection, which eliminates the variables completely. This is how the FEM examples, like ex12, work. Matt > Shri > > From: Miguel Angel Salazar de Troya > Date: Thu, 25 Sep 2014 12:57:53 -0500 > To: Shri > Cc: > Subject: Re: [petsc-users] DMPlex with spring elements > > > >Why not? Wouldn't we have a row of zeros except for the diagonal term? > >The column that corresponds to that degree of from doesn't have to be > >zero, right? > >Thanks > >Miguel > >On Sep 25, 2014 12:38 PM, "Abhyankar, Shrirang G." 
> >wrote: > > > > > > > >From: Miguel Angel Salazar de Troya > >Date: Thu, 25 Sep 2014 11:43:13 -0500 > >To: Shri > >Cc: > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > > > >I see, and I guess I would have to assign a value of one to the diagonal > >entry of that degree of freedom in the Jacobian right? > > > > > > > >Yes. > > > >Wouldn't this break the symmetry of the Jacobian (in case it were > >symmetric)? > > > > > >No. > > > >Shri > > > >Thanks > >Miguel > >On Sep 25, 2014 11:32 AM, "Abhyankar, Shrirang G." > >wrote: > > > >The solver does not know anything about the boundary conditions. You > >would have to specify it to the solver by describing the appropriate > >equations. For e.g. in the power grid example, there is a part in the > >residual evaluation > > > > if (bus->ide == REF_BUS || bus->ide == ISOLATED_BUS) { > > farr[offset] = 0.0; > > farr[offset+1] = 0.0; > > break; > > } > > > > > >This sets the residual at the nodes marked with REF_BUS or ISOLATED_BUS > >to 0.0. You can do something similar. > > > >Shri > > > > > > > >From: Miguel Angel Salazar de Troya > >Date: Thu, 25 Sep 2014 10:52:16 -0500 > >To: Shri > >Cc: "petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > > > >Thanks. Once I have marked the nodes that are fixed nodes using the > >component data structure, how can I process it later? I mean, at what > >point does the solver know that those degrees of freedom are actually > >fixed and how I can tell it that they > > are fixed? > > > >Miguel > > > > > >On Thu, Sep 25, 2014 at 10:27 AM, Abhyankar, Shrirang G. > > wrote: > > > > > > > >>Thanks. I think the term "Component" was confusing me, I thought it was > >>related to the components of a field. I think this would be useful to me > >>if I wanted to assign coordinates to the vertices, wouldn't it? > > > >Yes. You can put whatever data you want in the component data structure. > > > >> > >>Also, I was wondering how to set up dirichlet boundary conditions, > >>basically fixing certain nodes position. > >> > > > >> > >> > >You can add a component at each node with a field marking whether the node > >is a boundary node. > > > >>Could I do it as the function SetInitialValues does it in the pflow > >>example? > >> > > > >No. You need to put in the component data structure before calling > >DMNetworkAddComponent() > > > > > >>These values are used to eliminate the zeroth-order energy modes of the > >>stiffness matrix? > >> > > > > > >> > >>Last question, in my case I have two degrees of freedom per node, when I > >>grab the offset with DMNetworkVariableOffset, that's for the first degree > >>of freedom in that node and the second degree of freedom would just be > >>offset+1? > >> > > > >Yes. > > > >Shri > > > >> > >>Miguel > >> > >> > >>On Wed, Sep 24, 2014 at 9:52 PM, Abhyankar, Shrirang G. > >> wrote: > >> > >>If you have equations only at the nodes, with a part of it contributed by > >>the edges (springs), then you can use DMNetwork. If you are planning to > >>have equations for the beads in the future, or other higher layers, then > >>DMPlex has better functionality > >> to manage that. > >> > >>Shri > >> > >> > >>From: Miguel Angel Salazar de Troya > >>Date: Wed, 24 Sep 2014 17:38:11 -0500 > >>To: Shri > >>Cc: "petsc-users at mcs.anl.gov" > >>Subject: Re: [petsc-users] DMPlex with spring elements > >> > >> > >> > >> > >> > >>Thanks for your response. I'm attaching a pdf with a description of the > >>model. 
The description of the PetscSection is necessary for the > >>DMNetwork? It looks like DMNetwork does not use a PetscSection. > >> > >> > >> > >> > >> > >> > >>Miguel > >> > >> > >>On Wed, Sep 24, 2014 at 1:43 PM, Abhyankar, Shrirang G. > >> wrote: > >> > >> > >>>Thanks for your response. My discretization is based on spring elements. > >>>For the linear one dimensional case in which each spring has a > >>>coefficient k, their jacobian would be this two by two matrix. > >>>[ k -k ] > >>>[ -k k ] > >>> > >>>and the internal force > >>> > >>>[ k ( Ui - Uj) ] > >>>[ k ( Uj - Ui) ] > >>> > >>>where Ui and Uj are the node displacements (just one displacement per > >>>node because it's one dimensional) > >>> > >>>For the two dimensional case, assuming small deformations, we have a > >>>four-by-four matrix. Each node has two degrees of freedom. We obtain it > >>>by performing the outer product of the vector (t , -t) where "t" is the > >>>vector that connects both nodes in a spring. This is for the case of > >>>small deformations. I would need to assemble each spring contribution to > >>>the jacobian and the residual like they were finite elements. The > >>>springs > >>>share nodes, that's how they are connected. This example is just the > >>>linear case, I will have to implement a nonlinear case in a similar > >>>fashion. > >>> > >>>Seeing the DMNetwork example, I think it's what I need, although I don't > >>>know much of power electric grids and it's hard for me to understand > >>>what's going on. Do you have a good reference to be able to follow the > >>>code? > >> > >>> > >>Please see the attached document which has more description of DMNetwork > >>and the equations for the power grid example. I don't have anything that > >>describes how the power grid example is implemented. > >> > >>>For example, why are they adding components to the edges? > >>> > >>>475: DMNetworkAddComponent > >>>< > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/docs/manualpages/DM > >>>/ > >>>D > >>>MNetworkAddComponent.html#DMNetworkAddComponent>(networkdm,i,componentke > >>>y > >>>[ > >>>0],&pfdata.branch[i-eStart]);Miguel > >> > >>Each edge or node can have several components (limited to 10) attached to > >>it. The term components, taken from the circuit terminology, refers to > >>the > >>elements of a network. For example, a component could be a resistor, > >>inductor, spring, or even edge/vertex weights (for graph problems). For > >>code implementation, component is a data structure that holds the data > >>needed for the residual, Jacobian, or any other function evaluation. In > >>the case of power grid, there are 4 components: branches or transmission > >>lines connecting nodes, buses or nodes, generators that are incident at a > >>subset of the nodes, and loads that are also incident at a subset of the > >>nodes. Each of the these components are defined by their data structures > >>given in pf.h. > >> > >>DMNetwork is a wrapper class of DMPlex specifically for network > >>applications that can be solely described using nodes, edges, and their > >>associated components. If you have a PDE, or need FEM, or need other > >>advanced features then DMPlex would be suitable. Please send us a > >>write-up > >>of your equations so that we can assist you better. > >> > >>Shri > >> > >> > >>> > >>> > >>>On Tue, Sep 23, 2014 at 11:13 PM, Abhyankar, Shrirang G. > >>> wrote: > >>> > >>>You may also want to take a look at the DMNetwork framework that can be > >>>used for general unstructured networks that don't use PDEs. 
Its > >>>description is given in the manual and an example is in > >>>src/snes/examples/tutorials/network/pflow. > >>> > >>>Shri > >>> > >>>From: Matthew Knepley > >>>Date: Tue, 23 Sep 2014 22:40:52 -0400 > >>>To: Miguel Angel Salazar de Troya > >>>Cc: "petsc-users at mcs.anl.gov" > >>>Subject: Re: [petsc-users] DMPlex with spring elements > >>> > >>> > >>>>On Tue, Sep 23, 2014 at 4:01 PM, Miguel Angel Salazar de Troya > >>>> wrote: > >>>> > >>>>Hi all > >>>>I was wondering if it could be possible to build a model similar to the > >>>>example snes/ex12.c, but with spring elements (for elasticity) instead > >>>>of > >>>>simplicial elements. Spring elements in a grid, therefore each element > >>>>would have two nodes and each node two components. There would be more > >>>>differences, because instead of calling the functions f0,f1,g0,g1,g2 > >>>>and > >>>>g3 to build the residual and the jacobian, I would call a routine that > >>>>would build the residual vector and the jacobian matrix directly. I > >>>>would > >>>>not have shape functions whatsoever. My problem is discrete, I don't > >>>>have > >>>>a PDE and my equations are algebraic. What is the best way in petsc to > >>>>solve this problem? Is there any example that I can follow? Thanks in > >>>>advance > >>>> > >>>> > >>>> > >>>> > >>>>Yes, ex12 is fairly specific to FEM. However, I think the right tools > >>>>for > >>>>what you want are > >>>>DMPlex and PetscSection. Here is how I would proceed: > >>>> > >>>> 1) Make a DMPlex that encodes a simple network that you wish to > >>>>simulate > >>>> > >>>> 2) Make a PetscSection that gets the data layout right. Its hard from > >>>>the above > >>>> for me to understand where you degrees of freedom actually are. > >>>>This is usually > >>>> the hard part. > >>>> > >>>> 3) Calculate the residual, so you can check an exact solution. Here > >>>>you > >>>>use the > >>>> PetscSectionGetDof/Offset() for each mesh piece that you are > >>>>interested in. Again, > >>>> its hard to be more specific when I do not understand your > >>>>discretization. > >>>> > >>>> Thanks, > >>>> > >>>> Matt > >>>> > >>>> > >>>>Miguel > >>>> > >>>> > >>>> > >>>>-- > >>>>Miguel Angel Salazar de Troya > >>>>Graduate Research Assistant > >>>>Department of Mechanical Science and Engineering > >>>>University of Illinois at Urbana-Champaign > >>> > >>> > >>>>(217) 550-2360 > >>>>salaza11 at illinois.edu > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>>-- > >>>>What most experimenters take for granted before they begin their > >>>>experiments is infinitely more interesting than any results to which > >>>>their experiments lead. 
> >>>>-- Norbert Wiener > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>>-- > >>>Miguel Angel Salazar de Troya > >>>Graduate Research Assistant > >>>Department of Mechanical Science and Engineering > >>>University of Illinois at Urbana-Champaign > >>>(217) 550-2360 > >>>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.m.alletto at lmco.com Thu Sep 25 13:50:00 2014 From: john.m.alletto at lmco.com (Alletto, John M) Date: Thu, 25 Sep 2014 18:50:00 +0000 Subject: [petsc-users] stencil ordering for a 13 point star stencil Message-ID: All, In DMDACreate3D I declare a stencil width of 2 for a 13 point star stencil Later when I fill the stencil, I am assuming the following order... can you tell me if I am correct in the assumption. My assumption is the each V index maps to the corresponding I,j,k indicies of W W is the Stencil weighting at that position v[0] = W (I,j,k-2) v[1] =W(I,j-2,k) v[2] =W( i-1,j,k) v[3] =W( I,j,k-1) v[4] =W( I,j-1,k) v[5] =W( i-1,j,k) v[7] =W( i+1, j, k) v[8] =W( I,j+1,k) v[9] =W( I,j,k+1) v[10] =W (i+2,j,k) v[11] =W(I,j+2,k) v[12] =W(I,j,k+2) v[6] = sum( v[1..5])+ sum(v[6..12]) Thanks John -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 25 13:59:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 13:59:06 -0500 Subject: [petsc-users] stencil ordering for a 13 point star stencil In-Reply-To: References: Message-ID: On Thu, Sep 25, 2014 at 1:50 PM, Alletto, John M wrote: > All, > > > > In DMDACreate3D I declare a stencil width of 2 for a 13 point star stencil > > > > Later when I fill the stencil, I am assuming the following order? can you > tell me if I am correct in the assumption. > Are you talking about using MatSetValuesStencil()? If so, there is no inherent ordering. You plug in the i,j,k,c values. Thanks, Matt > My assumption is the each V index maps to the corresponding I,j,k indicies > of W > > > > W is the Stencil weighting at that position > > v[0] = W (I,j,k-2) > > v[1] =W(I,j-2,k) > > v[2] =W( i-1,j,k) > > v[3] =W( I,j,k-1) > > v[4] =W( I,j-1,k) > > v[5] =W( i-1,j,k) > > > > v[7] =W( i+1, j, k) > > v[8] =W( I,j+1,k) > > v[9] =W( I,j,k+1) > > v[10] =W (i+2,j,k) > > v[11] =W(I,j+2,k) > > v[12] =W(I,j,k+2) > > v[6] = sum( v[1..5])+ sum(v[6..12]) > > > > > > Thanks > > John > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
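For the 13-point star stencil question further up, here is a rough sketch of filling one matrix row with MatSetValuesStencil(); the weight array w[] is a placeholder, and the entries can be listed in any order because each one carries its own (i,j,k) index.

  #include <petscdmda.h>

  /* Insert one row of a width-2, 13-point star stencil at grid point (i,j,k).
     w[] must be ordered the same way the columns are listed below. */
  static PetscErrorCode InsertStarStencilRow(Mat A,PetscInt i,PetscInt j,PetscInt k,
                                             const PetscScalar w[13])
  {
    PetscErrorCode ierr;
    MatStencil     row,col[13];
    PetscInt       n = 0,s;

    PetscFunctionBegin;
    row.i = i; row.j = j; row.k = k; row.c = 0;
    for (s=-2; s<=2; s++) {                /* off-centre neighbours in i, j and k */
      if (!s) continue;
      col[n].i = i+s; col[n].j = j;   col[n].k = k;   col[n].c = 0; n++;
      col[n].i = i;   col[n].j = j+s; col[n].k = k;   col[n].c = 0; n++;
      col[n].i = i;   col[n].j = j;   col[n].k = k+s; col[n].c = 0; n++;
    }
    col[n].i = i; col[n].j = j; col[n].k = k; col[n].c = 0; n++;   /* centre, n = 13 */
    ierr = MatSetValuesStencil(A,1,&row,n,col,w,INSERT_VALUES);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

With a DMDA the matrix would normally come from DMCreateMatrix() and i,j,k would run over the range returned by DMDAGetCorners().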
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Sep 25 14:42:08 2014 From: jed at jedbrown.org (Jed Brown) Date: Thu, 25 Sep 2014 14:42:08 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: <878ul7cwrz.fsf@jedbrown.org> Matthew Knepley writes: > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. > wrote: > >> You are right. The Jacobian for the power grid application is indeed >> non-symmetric. Is that a problem for your application? >> > > If you need a symmetric Jacobian, you can use the BC facility in > PetscSection, which eliminates the > variables completely. This is how the FEM examples, like ex12, work. You can also use MatZeroRowsColumns() or do the equivalent transformation during assembly (my preference). -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From salazardetroya at gmail.com Thu Sep 25 17:15:10 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Thu, 25 Sep 2014 17:15:10 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: <878ul7cwrz.fsf@jedbrown.org> References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: > If you need a symmetric Jacobian, you can use the BC facility in > PetscSection, which eliminates the > variables completely. This is how the FEM examples, like ex12, work. Would that be with PetscSectionSetConstraintDof ? For that I will need the PetscSection, DofSection, within DMNetwork, how can I obtain it? I could cast it to DM_Network from the dm, networkdm, declared in the main program, maybe something like this: DM_Network *network = (DM_Network*) networkdm->data; Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) Thanks for your responses. Miguel On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > Matthew Knepley writes: > > > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov > >> wrote: > > > >> You are right. The Jacobian for the power grid application is indeed > >> non-symmetric. Is that a problem for your application? > >> > > > > If you need a symmetric Jacobian, you can use the BC facility in > > PetscSection, which eliminates the > > variables completely. This is how the FEM examples, like ex12, work. > > You can also use MatZeroRowsColumns() or do the equivalent > transformation during assembly (my preference). > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 25 17:17:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 25 Sep 2014 17:17:52 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > > If you need a symmetric Jacobian, you can use the BC facility in > > PetscSection, which eliminates the > > variables completely. This is how the FEM examples, like ex12, work. > > Would that be with PetscSectionSetConstraintDof ? 
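As a sketch of the constrained-dof route for a fixed node (illustrative only: dm and v are assumed names, and with DMNetwork this would have to happen at the point where the layout section is actually built):

  #include <petscdm.h>

  /* Mark both displacement dofs of mesh point v as Dirichlet-constrained so
     they are eliminated from the global system. The constraint count must be
     set before PetscSectionSetUp() is called on the section, the indices after. */
  PetscErrorCode ierr;
  PetscSection   s;
  const PetscInt cind[2] = {0,1};

  ierr = DMGetDefaultSection(dm,&s);CHKERRQ(ierr);
  ierr = PetscSectionSetConstraintDof(s,v,2);CHKERRQ(ierr);
  /* ... PetscSectionSetUp(s) happens here ... */
  ierr = PetscSectionSetConstraintIndices(s,v,cind);CHKERRQ(ierr);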
For that I will need the > PetscSection, DofSection, within DMNetwork, how can I obtain it? I could > cast it to DM_Network from the dm, networkdm, declared in the main > program, maybe something like this: > > DM_Network *network = (DM_Network*) networkdm->data; > > Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) > > I admit to not completely understanding DMNetwork. However, it eventually builds a PetscSection for data layout, which you could get from DMGetDefaultSection(). The right thing to do is find where it builds the Section, and put in your BC there, but that sounds like it would entail coding. Thanks, Matt > Thanks for your responses. > > Miguel > > > > On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > >> Matthew Knepley writes: >> >> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >> abhyshr at mcs.anl.gov >> >> wrote: >> > >> >> You are right. The Jacobian for the power grid application is indeed >> >> non-symmetric. Is that a problem for your application? >> >> >> > >> > If you need a symmetric Jacobian, you can use the BC facility in >> > PetscSection, which eliminates the >> > variables completely. This is how the FEM examples, like ex12, work. >> >> You can also use MatZeroRowsColumns() or do the equivalent >> transformation during assembly (my preference). >> > > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From potaman at outlook.com Thu Sep 25 22:29:14 2014 From: potaman at outlook.com (subramanya sadasiva) Date: Thu, 25 Sep 2014 23:29:14 -0400 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: , , , Message-ID: Hi Matt, Sorry about that, I changed if 'time' in h5: time = np.array(h5['time']).flatten() else: time = np.empty(1) The code now fails in the writeSpaceGridHeader function. with the error, Traceback (most recent call last): File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 232, in generateXdmf(sys.argv[1]) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 227, in generateXdmf Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 180, in write self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 64, in writeSpaceGridHeader print self.cellMap[cellDim][numCorners] The error is due to the fact that numCorners is set to be 1 , while celldim=2. cellMap has the following elements. {1: {1: 'Polyvertex', 2: 'Polyline'}, 2: {3: 'Triangle', 4: 'Quadrilateral'}, 3: {8: 'Hexahedron', 4: 'Tetrahedron'}} I also tried ./ex12 -dm_view vtk:my.vtk:vtk_vtu . This doesn't seem to do anything. Is there any specific option I need to build petsc with to get vtk output? My current build has hdf5 and netcdf enabled. 
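VTU output should not need any extra configure option. If the command-line viewer string keeps doing nothing, the same output can be requested from code, roughly like this (dm and u stand for the DM and its global solution vector; the file name is arbitrary):

  #include <petscdm.h>
  #include <petscviewer.h>

  /* Write the solution to a .vtu file that ParaView can open; the format is
     selected from the file extension. */
  PetscErrorCode ierr;
  PetscViewer    viewer;

  ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,"my.vtu",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = VecView(u,viewer);CHKERRQ(ierr);   /* u from DMCreateGlobalVector(dm,&u) */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);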
Thanks,Subramanya Date: Wed, 24 Sep 2014 17:36:52 -0500Subject: Re: [petsc-users] Generating xdmf from h5 file. From: knepley at gmail.com To: potaman at outlook.com; petsc-maint at mcs.anl.gov; petsc-users at mcs.anl.gov On Wed, Sep 24, 2014 at 5:29 PM, subramanya sadasiva wrote: Hi Matt, That did not help. That's not enough description to fix anything, and fixing it will require programming. Is there any other way to output the mesh to something that paraview can view? I tried outputting the file to a vtk file using ex12 -dm_view vtk:my.vtk:ascii_vtk which, I saw in another post on the forums, but that did not give me any output. This is mixing two different things. PETSc has a diagnostic ASCII vtk output, so the type would be ascii, not vtk,and format ascii_vtk . It also has a production VTU output, which is type vtk with format vtk_vtu. Thanks, Matt Subramanya Date: Wed, 24 Sep 2014 17:19:51 -0500 Subject: Re: [petsc-users] Generating xdmf from h5 file. From: knepley at gmail.com To: potaman at outlook.com CC: petsc-users at mcs.anl.gov On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva wrote: Hi, i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. The h5 file was generated by snes/ex12 which was run as, ex12 -dm_view hdf5:my.h5 When I do, petsc_gen_xdmf.py my.h5 I get the following error, File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 220, in generateXdmf(sys.argv[1]) File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 208, in generateXdmf time = np.array(h5['time']).flatten() File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in __getitem__ oid = h5o.open(self.id, self._e(name), lapl=self._lapl) File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) KeyError: "unable to open object (Symbol table: Can't open object)" I am not sure if the error is on my end. This is on Ubuntu 14.04 with the serial version of hdf5. I built petsc with --download-hdf5, is it necessary to use the same version of hdf5 to generate the xdmf file? That code is alpha, and mainly built for me to experiment with an application here, so it is not user-friendly. In yourHDF5 file, there is no 'time' since you are not running a TS. This access to h5['time'] should just be protected, andan empty array should be put in if its not there. Matt Thanks Subramanya -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Sep 26 07:29:40 2014 From: popov at uni-mainz.de (anton) Date: Fri, 26 Sep 2014 14:29:40 +0200 Subject: [petsc-users] fieldsplit doesn't pass prefix to inner ksp Message-ID: <54255C34.1090801@uni-mainz.de> Create preconditioner: PCCreate(PETSC_COMM_WORLD, &pc); PCSetOptionsPrefix(pc, "bf_"); PCSetFromOptions(pc); Define fieldsplit options: -bf_pc_type fieldsplit -bf_pc_fieldsplit_type SCHUR -bf_pc_fieldsplit_schur_factorization_type UPPER Works OK. Set options for the first field solver: -bf_fieldsplit_0_ksp_type preonly -bf_fieldsplit_0_pc_type lu Doesn't work (ignored), because "bf_" prefix isn't pass to inner solver ksp (checked in the debugger). 
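A quick way to see, without a debugger, which options prefix the inner solvers actually received (a sketch; pc is the fieldsplit preconditioner and must be set up first):

  #include <petscksp.h>

  /* Print the options prefix of each split's KSP. */
  PetscErrorCode ierr;
  KSP            *subksp;
  PetscInt       nsplit,i;
  const char     *prefix;

  ierr = PCSetUp(pc);CHKERRQ(ierr);
  ierr = PCFieldSplitGetSubKSP(pc,&nsplit,&subksp);CHKERRQ(ierr);
  for (i=0; i<nsplit; i++) {
    ierr = KSPGetOptionsPrefix(subksp[i],&prefix);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"split %D prefix: %s\n",i,prefix ? prefix : "(none)");CHKERRQ(ierr);
  }
  ierr = PetscFree(subksp);CHKERRQ(ierr);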
Indeed, the following works: -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type lu Observed with 3.5 but not with 3.4 Thanks. Anton From knepley at gmail.com Fri Sep 26 08:52:06 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 08:52:06 -0500 Subject: [petsc-users] fieldsplit doesn't pass prefix to inner ksp In-Reply-To: <54255C34.1090801@uni-mainz.de> References: <54255C34.1090801@uni-mainz.de> Message-ID: On Fri, Sep 26, 2014 at 7:29 AM, anton wrote: > Create preconditioner: > > PCCreate(PETSC_COMM_WORLD, &pc); > PCSetOptionsPrefix(pc, "bf_"); > PCSetFromOptions(pc); > > Define fieldsplit options: > > -bf_pc_type fieldsplit > -bf_pc_fieldsplit_type SCHUR > -bf_pc_fieldsplit_schur_factorization_type UPPER > > Works OK. > > Set options for the first field solver: > > -bf_fieldsplit_0_ksp_type preonly > -bf_fieldsplit_0_pc_type lu > > Doesn't work (ignored), because "bf_" prefix isn't pass to inner solver > ksp (checked in the debugger). > > Indeed, the following works: > > -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type lu > > Observed with 3.5 but not with 3.4 > I just tried this with master on SNES ex19, and got the correct result: knepley/feature-parallel-partition *$:/PETSc3/petsc/petsc-dev/src/snes/examples/tutorials$ ./ex19 -bf_pc_type fieldsplit -bf_snes_view ./ex19 -bf_pc_type fieldsplit -bf_snes_view lid velocity = 0.0625, prandtl # = 1, grashof # = 1 SNES Object:(bf_) 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=13 total number of function evaluations=3 SNESLineSearch Object: (bf_) 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: (bf_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (bf_) 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 4 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (bf_fieldsplit_x_velocity_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (bf_fieldsplit_x_velocity_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_x_velocity_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: 
(bf_fieldsplit_y_velocity_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (bf_fieldsplit_y_velocity_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_y_velocity_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 2 Defined by IS KSP Object: (bf_fieldsplit_Omega_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (bf_fieldsplit_Omega_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_Omega_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 3 Defined by IS KSP Object: (bf_fieldsplit_temperature_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (bf_fieldsplit_temperature_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_temperature_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=64, cols=64, bs=4 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Number of SNES iterations = 2 I will try with 3.5.2. Thanks, Matt > Thanks. 
> Anton > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 26 09:26:57 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 09:26:57 -0500 Subject: [petsc-users] fieldsplit doesn't pass prefix to inner ksp In-Reply-To: References: <54255C34.1090801@uni-mainz.de> Message-ID: Here is the result for 3.5.2, which looks right to me: (v3.5.2) *:/PETSc3/petsc/release-petsc-3.5.1/src/snes/examples/tutorials$ ./ex19 -bf_pc_type fieldsplit -bf_snes_view -bf_pc_fieldsplit_type schur -bf_pc_fieldsplit_0_fields 0,1,2 -bf_pc_fieldsplit_1_fields 3 -bf_pc_fieldsplit_schur_factorization_type upper lid velocity = 0.0625, prandtl # = 1, grashof # = 1 SNES Object:(bf_) 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=4 total number of function evaluations=3 SNESLineSearch Object: (bf_) 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: (bf_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (bf_) 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization UPPER Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (bf_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (bf_fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=48, cols=48 package used to perform factorization: petsc total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (bf_fieldsplit_temperature_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero 
tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (bf_fieldsplit_temperature_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=16, cols=16 package used to perform factorization: petsc total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix followed by preconditioner matrix: Mat Object: (bf_fieldsplit_temperature_) 1 MPI processes type: schurcomplement rows=16, cols=16 Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (bf_fieldsplit_temperature_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=16, cols=48 total: nonzeros=192, allocated nonzeros=192 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (bf_fieldsplit_0_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (bf_fieldsplit_0_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=48, cols=48 package used to perform factorization: petsc total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (bf_fieldsplit_0_) 1 MPI processes type: seqaij rows=48, cols=48 total: nonzeros=576, allocated nonzeros=576 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=48, cols=16 total: nonzeros=192, allocated nonzeros=192 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Mat Object: (bf_fieldsplit_temperature_) 1 MPI processes type: seqaij rows=16, cols=16 total: nonzeros=64, allocated nonzeros=64 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=64, cols=64, bs=4 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 16 nodes, limit used is 5 Number of SNES iterations = 2 Thanks, Matt On Fri, Sep 26, 2014 at 8:52 AM, Matthew Knepley wrote: > On Fri, Sep 26, 2014 at 7:29 AM, anton wrote: > >> Create preconditioner: >> >> PCCreate(PETSC_COMM_WORLD, &pc); >> PCSetOptionsPrefix(pc, "bf_"); >> PCSetFromOptions(pc); >> >> Define fieldsplit options: >> >> -bf_pc_type fieldsplit >> -bf_pc_fieldsplit_type SCHUR >> 
-bf_pc_fieldsplit_schur_factorization_type UPPER >> >> Works OK. >> >> Set options for the first field solver: >> >> -bf_fieldsplit_0_ksp_type preonly >> -bf_fieldsplit_0_pc_type lu >> >> Doesn't work (ignored), because "bf_" prefix isn't pass to inner solver >> ksp (checked in the debugger). >> >> Indeed, the following works: >> >> -fieldsplit_0_ksp_type preonly >> -fieldsplit_0_pc_type lu >> >> Observed with 3.5 but not with 3.4 >> > > I just tried this with master on SNES ex19, and got the correct result: > > knepley/feature-parallel-partition > *$:/PETSc3/petsc/petsc-dev/src/snes/examples/tutorials$ ./ex19 -bf_pc_type > fieldsplit -bf_snes_view > ./ex19 -bf_pc_type fieldsplit -bf_snes_view > lid velocity = 0.0625, prandtl # = 1, grashof # = 1 > SNES Object:(bf_) 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=13 > total number of function evaluations=3 > SNESLineSearch Object: (bf_) 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: (bf_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_) 1 MPI processes > type: fieldsplit > FieldSplit with MULTIPLICATIVE composition: total splits = 4 > Solver info for each split is in the following KSP objects: > Split number 0 Defined by IS > KSP Object: (bf_fieldsplit_x_velocity_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_x_velocity_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_x_velocity_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Split number 1 Defined by IS > KSP Object: (bf_fieldsplit_y_velocity_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_y_velocity_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix 
follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_y_velocity_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Split number 2 Defined by IS > KSP Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Split number 3 Defined by IS > KSP Object: (bf_fieldsplit_temperature_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_temperature_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=64, cols=64, bs=4 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > Number of SNES iterations = 2 > > I will try with 3.5.2. > > Thanks, > > Matt > > >> Thanks. >> Anton >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Fri Sep 26 09:31:35 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 26 Sep 2014 09:31:35 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: Thanks. I had another question about the DM and SNES and TS. There are similar routines to assign the residual and jacobian evaluation to both objects. For the SNES case are: DMSNESSetFunctionLocal DMSNESSetJacobianLocal What are the differences of these with: SNESSetFunction SNESSetJacobian and when should we use each? With "Local", it is meant to evaluate the function/jacobian for the elements in the local processor? I could get the local edges in DMNetwork by calling DMNetworkGetEdgeRange? Miguel On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley wrote: > On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > >> > If you need a symmetric Jacobian, you can use the BC facility in >> > PetscSection, which eliminates the >> > variables completely. This is how the FEM examples, like ex12, work. >> >> Would that be with PetscSectionSetConstraintDof ? For that I will need >> the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >> could cast it to DM_Network from the dm, networkdm, declared in the main >> program, maybe something like this: >> >> DM_Network *network = (DM_Network*) networkdm->data; >> >> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >> >> I admit to not completely understanding DMNetwork. However, it eventually > builds a PetscSection for data layout, which > you could get from DMGetDefaultSection(). The right thing to do is find > where it builds the Section, and put in your BC > there, but that sounds like it would entail coding. > > Thanks, > > Matt > > >> Thanks for your responses. >> >> Miguel >> >> >> >> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >> >>> Matthew Knepley writes: >>> >>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>> abhyshr at mcs.anl.gov >>> >> wrote: >>> > >>> >> You are right. The Jacobian for the power grid application is indeed >>> >> non-symmetric. Is that a problem for your application? >>> >> >>> > >>> > If you need a symmetric Jacobian, you can use the BC facility in >>> > PetscSection, which eliminates the >>> > variables completely. This is how the FEM examples, like ex12, work. >>> >>> You can also use MatZeroRowsColumns() or do the equivalent >>> transformation during assembly (my preference). >>> >> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Sep 26 09:34:21 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 09:34:21 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Thanks. I had another question about the DM and SNES and TS. There are > similar routines to assign the residual and jacobian evaluation to both > objects. For the SNES case are: > > DMSNESSetFunctionLocal > DMSNESSetJacobianLocal > > What are the differences of these with: > > SNESSetFunction > SNESSetJacobian > SNESSetFunction() expects the user to construct the entire parallel residual vector. DMSNESSetFunctionLocal() expects the user to construct the local pieces of the residual, and then it automatically calls DMLocalToGlobal() to assembly the full residual. It also converts the input from global vectors to local vectors, and in the case of DMDA multidimensional arrays. Thanks, Matt > and when should we use each? With "Local", it is meant to evaluate the > function/jacobian for the elements in the local processor? I could get the > local edges in DMNetwork by calling DMNetworkGetEdgeRange? > > Miguel > > On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley > wrote: > >> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >> salazardetroya at gmail.com> wrote: >> >>> > If you need a symmetric Jacobian, you can use the BC facility in >>> > PetscSection, which eliminates the >>> > variables completely. This is how the FEM examples, like ex12, work. >>> >>> Would that be with PetscSectionSetConstraintDof ? For that I will need >>> the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>> could cast it to DM_Network from the dm, networkdm, declared in the main >>> program, maybe something like this: >>> >>> DM_Network *network = (DM_Network*) networkdm->data; >>> >>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>> >>> I admit to not completely understanding DMNetwork. However, it >> eventually builds a PetscSection for data layout, which >> you could get from DMGetDefaultSection(). The right thing to do is find >> where it builds the Section, and put in your BC >> there, but that sounds like it would entail coding. >> >> Thanks, >> >> Matt >> >> >>> Thanks for your responses. >>> >>> Miguel >>> >>> >>> >>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>> >>>> Matthew Knepley writes: >>>> >>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>> abhyshr at mcs.anl.gov >>>> >> wrote: >>>> > >>>> >> You are right. The Jacobian for the power grid application is indeed >>>> >> non-symmetric. Is that a problem for your application? >>>> >> >>>> > >>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>> > PetscSection, which eliminates the >>>> > variables completely. This is how the FEM examples, like ex12, work. >>>> >>>> You can also use MatZeroRowsColumns() or do the equivalent >>>> transformation during assembly (my preference). 
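To make the MatZeroRowsColumns() option mentioned just above concrete, here is a minimal sketch (an illustration only, not code from this thread); the index list rows, the vector of prescribed values x, and the right-hand side b are all assumed to be supplied by the application:

#include <petscmat.h>

/* Sketch: eliminate constrained dofs after assembly while keeping symmetry.
   Rows AND columns are zeroed, 1.0 is placed on the diagonal, and b is
   adjusted using the prescribed values in x so the eliminated dofs still
   come out at their boundary values. */
PetscErrorCode ApplyConstraints(Mat A, PetscInt nrows, const PetscInt rows[], Vec x, Vec b)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatZeroRowsColumns(A, nrows, rows, 1.0, x, b);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The alternative Jed prefers, doing the equivalent transformation during assembly, avoids touching the assembled matrix at all.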
>>>> >>> >>> >>> >>> -- >>> *Miguel Angel Salazar de Troya* >>> Graduate Research Assistant >>> Department of Mechanical Science and Engineering >>> University of Illinois at Urbana-Champaign >>> (217) 550-2360 >>> salaza11 at illinois.edu >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Fri Sep 26 10:06:26 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 26 Sep 2014 10:06:26 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: That means that if we call SNESSetFunction() we don't build the residual vector in parallel? In the pflow example ( http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tutorials/network/pflow/pf.c.html) the function FormFunction() (Input for SNESSetFunction() works with the local vectors. I don't understand this. Thanks Miguel On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley wrote: > On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > >> Thanks. I had another question about the DM and SNES and TS. There are >> similar routines to assign the residual and jacobian evaluation to both >> objects. For the SNES case are: >> >> DMSNESSetFunctionLocal >> DMSNESSetJacobianLocal >> >> What are the differences of these with: >> >> SNESSetFunction >> SNESSetJacobian >> > > SNESSetFunction() expects the user to construct the entire parallel > residual vector. DMSNESSetFunctionLocal() > expects the user to construct the local pieces of the residual, and then > it automatically calls DMLocalToGlobal() > to assembly the full residual. It also converts the input from global > vectors to local vectors, and in the case of > DMDA multidimensional arrays. > > Thanks, > > Matt > > >> and when should we use each? With "Local", it is meant to evaluate the >> function/jacobian for the elements in the local processor? I could get the >> local edges in DMNetwork by calling DMNetworkGetEdgeRange? >> >> Miguel >> >> On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >> wrote: >> >>> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >>> salazardetroya at gmail.com> wrote: >>> >>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>> > PetscSection, which eliminates the >>>> > variables completely. This is how the FEM examples, like ex12, work. >>>> >>>> Would that be with PetscSectionSetConstraintDof ? For that I will need >>>> the PetscSection, DofSection, within DMNetwork, how can I obtain it? 
I >>>> could cast it to DM_Network from the dm, networkdm, declared in the main >>>> program, maybe something like this: >>>> >>>> DM_Network *network = (DM_Network*) networkdm->data; >>>> >>>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>>> >>>> I admit to not completely understanding DMNetwork. However, it >>> eventually builds a PetscSection for data layout, which >>> you could get from DMGetDefaultSection(). The right thing to do is find >>> where it builds the Section, and put in your BC >>> there, but that sounds like it would entail coding. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks for your responses. >>>> >>>> Miguel >>>> >>>> >>>> >>>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>>> >>>>> Matthew Knepley writes: >>>>> >>>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>>> abhyshr at mcs.anl.gov >>>>> >> wrote: >>>>> > >>>>> >> You are right. The Jacobian for the power grid application is indeed >>>>> >> non-symmetric. Is that a problem for your application? >>>>> >> >>>>> > >>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>> > PetscSection, which eliminates the >>>>> > variables completely. This is how the FEM examples, like ex12, work. >>>>> >>>>> You can also use MatZeroRowsColumns() or do the equivalent >>>>> transformation during assembly (my preference). >>>>> >>>> >>>> >>>> >>>> -- >>>> *Miguel Angel Salazar de Troya* >>>> Graduate Research Assistant >>>> Department of Mechanical Science and Engineering >>>> University of Illinois at Urbana-Champaign >>>> (217) 550-2360 >>>> salaza11 at illinois.edu >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmeiser at txcorp.com Fri Sep 26 10:16:41 2014 From: dmeiser at txcorp.com (Dominic Meiser) Date: Fri, 26 Sep 2014 09:16:41 -0600 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> Message-ID: <54258359.6010809@txcorp.com> On 09/22/2014 01:47 PM, Ashwin Srinath wrote: > Dominic, I second a request for such a branch. > > Thanks, > Ashwin > Hi Ashwin, I put together a branch with our GPU bug fixes: https://github.com/Tech-XCorp/petsc/ There is only one branch in this repo: gpu-master. Be aware that this branch is even more experimental than petsc/next. It contains experimental code that may be wrong or that will change before this can be merged into petsc/next or petsc/master. Let me know if you run into problems. Karl, this branch may also be helpful in reviewing PR #178. 
Cheers, Dominic -- Dominic Meiser Tech-X Corporation 5621 Arapahoe Avenue Boulder, CO 80303 USA Telephone: 303-996-2036 Fax: 303-448-7756 www.txcorp.com From knepley at gmail.com Fri Sep 26 10:10:02 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 10:10:02 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > That means that if we call SNESSetFunction() we don't build the residual > vector in parallel? In the pflow example ( > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tutorials/network/pflow/pf.c.html) > the function FormFunction() (Input for SNESSetFunction() works with the > local vectors. I don't understand this. > FormFunction() in that link clearly takes in a global vector X and returns a global vector F. Inside, it converts them to local vectors. This is exactly what you would do for a function given to SNESSetFunction(). Matt > > Thanks > Miguel > > On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley > wrote: > >> On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < >> salazardetroya at gmail.com> wrote: >> >>> Thanks. I had another question about the DM and SNES and TS. There are >>> similar routines to assign the residual and jacobian evaluation to both >>> objects. For the SNES case are: >>> >>> DMSNESSetFunctionLocal >>> DMSNESSetJacobianLocal >>> >>> What are the differences of these with: >>> >>> SNESSetFunction >>> SNESSetJacobian >>> >> >> SNESSetFunction() expects the user to construct the entire parallel >> residual vector. DMSNESSetFunctionLocal() >> expects the user to construct the local pieces of the residual, and then >> it automatically calls DMLocalToGlobal() >> to assembly the full residual. It also converts the input from global >> vectors to local vectors, and in the case of >> DMDA multidimensional arrays. >> >> Thanks, >> >> Matt >> >> >>> and when should we use each? With "Local", it is meant to evaluate the >>> function/jacobian for the elements in the local processor? I could get the >>> local edges in DMNetwork by calling DMNetworkGetEdgeRange? >>> >>> Miguel >>> >>> On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>> wrote: >>> >>>> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >>>> salazardetroya at gmail.com> wrote: >>>> >>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>> > PetscSection, which eliminates the >>>>> > variables completely. This is how the FEM examples, like ex12, work. >>>>> >>>>> Would that be with PetscSectionSetConstraintDof ? For that I will need >>>>> the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>>>> could cast it to DM_Network from the dm, networkdm, declared in the main >>>>> program, maybe something like this: >>>>> >>>>> DM_Network *network = (DM_Network*) networkdm->data; >>>>> >>>>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>>>> >>>>> I admit to not completely understanding DMNetwork. However, it >>>> eventually builds a PetscSection for data layout, which >>>> you could get from DMGetDefaultSection(). The right thing to do is find >>>> where it builds the Section, and put in your BC >>>> there, but that sounds like it would entail coding. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks for your responses. 
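As a rough illustration of the pattern Matt describes above (a sketch under assumptions, not the actual pf.c code), a residual routine registered with SNESSetFunction() receives global vectors and converts them to local vectors internally; with DMSNESSetFunctionLocal() the callback would instead receive the local vectors directly and the DM would do the scatters for you:

#include <petscsnes.h>

/* Sketch of a global-form residual callback for SNESSetFunction(). */
PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  DM             dm;
  Vec            localX, localF;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = SNESGetDM(snes, &dm);CHKERRQ(ierr);
  ierr = DMGetLocalVector(dm, &localX);CHKERRQ(ierr);
  ierr = DMGetLocalVector(dm, &localF);CHKERRQ(ierr);
  ierr = VecSet(localF, 0.0);CHKERRQ(ierr);
  /* bring ghost values of the global X onto this process */
  ierr = DMGlobalToLocalBegin(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
  /* ... evaluate the residual into localF using localX ... */
  /* add the local contributions back into the global residual F */
  ierr = VecSet(F, 0.0);CHKERRQ(ierr);
  ierr = DMLocalToGlobalBegin(dm, localF, ADD_VALUES, F);CHKERRQ(ierr);
  ierr = DMLocalToGlobalEnd(dm, localF, ADD_VALUES, F);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(dm, &localX);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(dm, &localF);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}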
>>>>> >>>>> Miguel >>>>> >>>>> >>>>> >>>>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>>>> >>>>>> Matthew Knepley writes: >>>>>> >>>>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>>>> abhyshr at mcs.anl.gov >>>>>> >> wrote: >>>>>> > >>>>>> >> You are right. The Jacobian for the power grid application is >>>>>> indeed >>>>>> >> non-symmetric. Is that a problem for your application? >>>>>> >> >>>>>> > >>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>> > PetscSection, which eliminates the >>>>>> > variables completely. This is how the FEM examples, like ex12, work. >>>>>> >>>>>> You can also use MatZeroRowsColumns() or do the equivalent >>>>>> transformation during assembly (my preference). >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Miguel Angel Salazar de Troya* >>>>> Graduate Research Assistant >>>>> Department of Mechanical Science and Engineering >>>>> University of Illinois at Urbana-Champaign >>>>> (217) 550-2360 >>>>> salaza11 at illinois.edu >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> *Miguel Angel Salazar de Troya* >>> Graduate Research Assistant >>> Department of Mechanical Science and Engineering >>> University of Illinois at Urbana-Champaign >>> (217) 550-2360 >>> salaza11 at illinois.edu >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashwinsrnth at gmail.com Fri Sep 26 10:14:17 2014 From: ashwinsrnth at gmail.com (Ashwin Srinath) Date: Fri, 26 Sep 2014 11:14:17 -0400 Subject: [petsc-users] GPU speedup in Poisson solvers In-Reply-To: <54258359.6010809@txcorp.com> References: <1411412227.54724.YahooMailBasic@web140103.mail.bf1.yahoo.com> <54207AB5.2050104@txcorp.com> <54258359.6010809@txcorp.com> Message-ID: Thanks Dominic! I'll give you some feedback soon! On Fri, Sep 26, 2014 at 11:16 AM, Dominic Meiser wrote: > On 09/22/2014 01:47 PM, Ashwin Srinath wrote: > >> Dominic, I second a request for such a branch. >> >> Thanks, >> Ashwin >> >> Hi Ashwin, > > I put together a branch with our GPU bug fixes: > > https://github.com/Tech-XCorp/petsc/ > > There is only one branch in this repo: gpu-master. Be aware that this > branch is even more experimental than petsc/next. It contains experimental > code that may be wrong or that will change before this can be merged into > petsc/next or petsc/master. Let me know if you run into problems. > > Karl, this branch may also be helpful in reviewing PR #178. > > > Cheers, > Dominic > > -- > Dominic Meiser > Tech-X Corporation > 5621 Arapahoe Avenue > Boulder, CO 80303 > USA > Telephone: 303-996-2036 > Fax: 303-448-7756 > www.txcorp.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From popov at uni-mainz.de Fri Sep 26 10:17:34 2014 From: popov at uni-mainz.de (anton) Date: Fri, 26 Sep 2014 17:17:34 +0200 Subject: [petsc-users] fieldsplit doesn't pass prefix to inner ksp In-Reply-To: References: <54255C34.1090801@uni-mainz.de> Message-ID: <5425838E.6030909@uni-mainz.de> This is what I get for 3.5.0: anton at anton ~/LIB/petsc-3.5.0-deb/src/snes/examples/tutorials $ ./ex19 -bf_pc_type fieldsplit -bf_snes_view lid velocity = 0.0625, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! Option left: name:-bf_pc_type value: fieldsplit Option left: name:-bf_snes_view (no value) It seems like it was already corrected between 3.5.0 & 3.5.2 On 09/26/2014 04:26 PM, Matthew Knepley wrote: > Here is the result for 3.5.2, which looks right to me: > > (v3.5.2) > *:/PETSc3/petsc/release-petsc-3.5.1/src/snes/examples/tutorials$ > ./ex19 -bf_pc_type fieldsplit -bf_snes_view -bf_pc_fieldsplit_type > schur -bf_pc_fieldsplit_0_fields 0,1,2 -bf_pc_fieldsplit_1_fields 3 > -bf_pc_fieldsplit_schur_factorization_type upper > lid velocity = 0.0625, prandtl # = 1, grashof # = 1 > SNES Object:(bf_) 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=4 > total number of function evaluations=3 > SNESLineSearch Object: (bf_) 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: (bf_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_) 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization UPPER > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (bf_fieldsplit_0_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_fieldsplit_0_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=48, cols=48 > package used to perform factorization: petsc > total: nonzeros=576, allocated nonzeros=576 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 16 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_0_) 1 MPI processes > type: seqaij > rows=48, cols=48 > total: nonzeros=576, allocated nonzeros=576 > total 
number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: gmres > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix followed by preconditioner matrix: > Mat Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: schurcomplement > rows=16, cols=16 > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: (bf_fieldsplit_temperature_) > 1 MPI processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=48 > total: nonzeros=192, allocated nonzeros=192 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > KSP of A00 > KSP Object: (bf_fieldsplit_0_) 1 MPI > processes > type: gmres > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_fieldsplit_0_) 1 MPI > processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero > pivot [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=48, cols=48 > package used to perform factorization: petsc > total: nonzeros=576, allocated nonzeros=576 > total number of mallocs used during > MatSetValues calls =0 > using I-node routines: found 16 nodes, > limit used is 5 > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_0_) 1 > MPI processes > type: seqaij > rows=48, cols=48 > total: nonzeros=576, allocated nonzeros=576 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 16 nodes, limit > used is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=48, cols=16 > total: nonzeros=192, allocated nonzeros=192 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 16 nodes, limit used is 5 > Mat Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines 
> linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=64, cols=64, bs=4 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > Number of SNES iterations = 2 > > Thanks, > > Matt > > > On Fri, Sep 26, 2014 at 8:52 AM, Matthew Knepley > wrote: > > On Fri, Sep 26, 2014 at 7:29 AM, anton > wrote: > > Create preconditioner: > > PCCreate(PETSC_COMM_WORLD, &pc); > PCSetOptionsPrefix(pc, "bf_"); > PCSetFromOptions(pc); > > Define fieldsplit options: > > -bf_pc_type fieldsplit > -bf_pc_fieldsplit_type SCHUR > -bf_pc_fieldsplit_schur_factorization_type UPPER > > Works OK. > > Set options for the first field solver: > > -bf_fieldsplit_0_ksp_type preonly > -bf_fieldsplit_0_pc_type lu > > Doesn't work (ignored), because "bf_" prefix isn't pass to > inner solver ksp (checked in the debugger). > > Indeed, the following works: > > -fieldsplit_0_ksp_type preonly > -fieldsplit_0_pc_type lu > > Observed with 3.5 but not with 3.4 > > > I just tried this with master on SNES ex19, and got the correct > result: > > knepley/feature-parallel-partition > *$:/PETSc3/petsc/petsc-dev/src/snes/examples/tutorials$ ./ex19 > -bf_pc_type fieldsplit -bf_snes_view > ./ex19 -bf_pc_type fieldsplit -bf_snes_view > lid velocity = 0.0625, prandtl # = 1, grashof # = 1 > SNES Object:(bf_) 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=13 > total number of function evaluations=3 > SNESLineSearch Object: (bf_) 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: (bf_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (bf_) 1 MPI processes > type: fieldsplit > FieldSplit with MULTIPLICATIVE composition: total splits = 4 > Solver info for each split is in the following KSP objects: > Split number 0 Defined by IS > KSP Object: (bf_fieldsplit_x_velocity_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_x_velocity_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_x_velocity_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during 
MatSetValues calls =0 > not using I-node routines > Split number 1 Defined by IS > KSP Object: (bf_fieldsplit_y_velocity_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_y_velocity_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_y_velocity_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Split number 2 Defined by IS > KSP Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_Omega_) 1 MPI processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Split number 3 Defined by IS > KSP Object: (bf_fieldsplit_temperature_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (bf_fieldsplit_temperature_) 1 MPI processes > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot > [INBLOCKS] > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=16, cols=16 > package used to perform factorization: petsc > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues > calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (bf_fieldsplit_temperature_) 1 MPI > processes > type: seqaij > rows=16, cols=16 > total: nonzeros=64, allocated nonzeros=64 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=64, cols=64, bs=4 > total: nonzeros=1024, allocated nonzeros=1024 > 
total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 16 nodes, limit used is 5 > Number of SNES iterations = 2 > > I will try with 3.5.2. > > Thanks, > > Matt > > Thanks. > Anton > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to > which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Fri Sep 26 10:24:53 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Fri, 26 Sep 2014 15:24:53 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: DMNetwork does not have the functionality to constraint a dof. The user has to do it by specifying the constraint equation. As Jed suggested, you could simply not set any values in the rows/columns (except the diagonal) corresponding to the constrained dof in the Jacobian evaluation to keep the matrix symmetric. Shri From: Matthew Knepley Date: Thu, 25 Sep 2014 17:17:52 -0500 To: Miguel Angel Salazar de Troya Cc: Jed Brown , Shri , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] DMPlex with spring elements >On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya > wrote: > >> If you need a symmetric Jacobian, you can use the BC facility in >> PetscSection, which eliminates the >> variables completely. This is how the FEM examples, like ex12, work. >Would that be with PetscSectionSetConstraintDof ? For that I will need >the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >could cast it to DM_Network from the dm, networkdm, declared in the main >program, maybe something like this: >DM_Network *network = (DM_Network*) networkdm->data;Then I would loop >over the vertices and call PetscSectionSetConstraintDof if it's a >boundary node (by checking the corresponding component) > > > > >I admit to not completely understanding DMNetwork. However, it eventually >builds a PetscSection for data layout, which >you could get from DMGetDefaultSection(). The right thing to do is find >where it builds the Section, and put in your BC >there, but that sounds like it would entail coding. > > Thanks, > > Matt > > >Thanks for your responses.Miguel > > >On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > >Matthew Knepley writes: > >> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. >>>> wrote: >> >>> You are right. The Jacobian for the power grid application is indeed >>> non-symmetric. Is that a problem for your application? >>> >> >> If you need a symmetric Jacobian, you can use the BC facility in >> PetscSection, which eliminates the >> variables completely. This is how the FEM examples, like ex12, work. > >You can also use MatZeroRowsColumns() or do the equivalent >transformation during assembly (my preference). > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. 
>-- Norbert Wiener From salazardetroya at gmail.com Fri Sep 26 10:26:31 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 26 Sep 2014 10:26:31 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: Yeah, but doesn't it only work with the local vectors localX and localF? Miguel On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley wrote: > On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > >> That means that if we call SNESSetFunction() we don't build the residual >> vector in parallel? In the pflow example ( >> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tutorials/network/pflow/pf.c.html) >> the function FormFunction() (Input for SNESSetFunction() works with the >> local vectors. I don't understand this. >> > > FormFunction() in that link clearly takes in a global vector X and returns > a global vector F. Inside, it > converts them to local vectors. This is exactly what you would do for a > function given to SNESSetFunction(). > > Matt > > >> >> Thanks >> Miguel >> >> On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >> wrote: >> >>> On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < >>> salazardetroya at gmail.com> wrote: >>> >>>> Thanks. I had another question about the DM and SNES and TS. There are >>>> similar routines to assign the residual and jacobian evaluation to both >>>> objects. For the SNES case are: >>>> >>>> DMSNESSetFunctionLocal >>>> DMSNESSetJacobianLocal >>>> >>>> What are the differences of these with: >>>> >>>> SNESSetFunction >>>> SNESSetJacobian >>>> >>> >>> SNESSetFunction() expects the user to construct the entire parallel >>> residual vector. DMSNESSetFunctionLocal() >>> expects the user to construct the local pieces of the residual, and then >>> it automatically calls DMLocalToGlobal() >>> to assembly the full residual. It also converts the input from global >>> vectors to local vectors, and in the case of >>> DMDA multidimensional arrays. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> and when should we use each? With "Local", it is meant to evaluate the >>>> function/jacobian for the elements in the local processor? I could get the >>>> local edges in DMNetwork by calling DMNetworkGetEdgeRange? >>>> >>>> Miguel >>>> >>>> On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>>> wrote: >>>> >>>>> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >>>>> salazardetroya at gmail.com> wrote: >>>>> >>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>> > PetscSection, which eliminates the >>>>>> > variables completely. This is how the FEM examples, like ex12, work. >>>>>> >>>>>> Would that be with PetscSectionSetConstraintDof ? For that I will >>>>>> need the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>>>>> could cast it to DM_Network from the dm, networkdm, declared in the main >>>>>> program, maybe something like this: >>>>>> >>>>>> DM_Network *network = (DM_Network*) networkdm->data; >>>>>> >>>>>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>>>>> >>>>>> I admit to not completely understanding DMNetwork. However, it >>>>> eventually builds a PetscSection for data layout, which >>>>> you could get from DMGetDefaultSection(). 
The right thing to do is >>>>> find where it builds the Section, and put in your BC >>>>> there, but that sounds like it would entail coding. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks for your responses. >>>>>> >>>>>> Miguel >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>>>>> >>>>>>> Matthew Knepley writes: >>>>>>> >>>>>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>>>>> abhyshr at mcs.anl.gov >>>>>>> >> wrote: >>>>>>> > >>>>>>> >> You are right. The Jacobian for the power grid application is >>>>>>> indeed >>>>>>> >> non-symmetric. Is that a problem for your application? >>>>>>> >> >>>>>>> > >>>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>>> > PetscSection, which eliminates the >>>>>>> > variables completely. This is how the FEM examples, like ex12, >>>>>>> work. >>>>>>> >>>>>>> You can also use MatZeroRowsColumns() or do the equivalent >>>>>>> transformation during assembly (my preference). >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> *Miguel Angel Salazar de Troya* >>>>>> Graduate Research Assistant >>>>>> Department of Mechanical Science and Engineering >>>>>> University of Illinois at Urbana-Champaign >>>>>> (217) 550-2360 >>>>>> salaza11 at illinois.edu >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>>> >>>> -- >>>> *Miguel Angel Salazar de Troya* >>>> Graduate Research Assistant >>>> Department of Mechanical Science and Engineering >>>> University of Illinois at Urbana-Champaign >>>> (217) 550-2360 >>>> salaza11 at illinois.edu >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 26 10:28:26 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 10:28:26 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Yeah, but doesn't it only work with the local vectors localX and localF? > I am telling you what the interface for the functions is. You can do whatever you want inside. Matt > Miguel > > On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley > wrote: > >> On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya < >> salazardetroya at gmail.com> wrote: >> >>> That means that if we call SNESSetFunction() we don't build the >>> residual vector in parallel? 
In the pflow example ( >>> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tutorials/network/pflow/pf.c.html) >>> the function FormFunction() (Input for SNESSetFunction() works with the >>> local vectors. I don't understand this. >>> >> >> FormFunction() in that link clearly takes in a global vector X and >> returns a global vector F. Inside, it >> converts them to local vectors. This is exactly what you would do for a >> function given to SNESSetFunction(). >> >> Matt >> >> >>> >>> Thanks >>> Miguel >>> >>> On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >>> wrote: >>> >>>> On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < >>>> salazardetroya at gmail.com> wrote: >>>> >>>>> Thanks. I had another question about the DM and SNES and TS. There are >>>>> similar routines to assign the residual and jacobian evaluation to both >>>>> objects. For the SNES case are: >>>>> >>>>> DMSNESSetFunctionLocal >>>>> DMSNESSetJacobianLocal >>>>> >>>>> What are the differences of these with: >>>>> >>>>> SNESSetFunction >>>>> SNESSetJacobian >>>>> >>>> >>>> SNESSetFunction() expects the user to construct the entire parallel >>>> residual vector. DMSNESSetFunctionLocal() >>>> expects the user to construct the local pieces of the residual, and >>>> then it automatically calls DMLocalToGlobal() >>>> to assembly the full residual. It also converts the input from global >>>> vectors to local vectors, and in the case of >>>> DMDA multidimensional arrays. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> and when should we use each? With "Local", it is meant to evaluate the >>>>> function/jacobian for the elements in the local processor? I could get the >>>>> local edges in DMNetwork by calling DMNetworkGetEdgeRange? >>>>> >>>>> Miguel >>>>> >>>>> On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >>>>>> salazardetroya at gmail.com> wrote: >>>>>> >>>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>>> > PetscSection, which eliminates the >>>>>>> > variables completely. This is how the FEM examples, like ex12, >>>>>>> work. >>>>>>> >>>>>>> Would that be with PetscSectionSetConstraintDof ? For that I will >>>>>>> need the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>>>>>> could cast it to DM_Network from the dm, networkdm, declared in the main >>>>>>> program, maybe something like this: >>>>>>> >>>>>>> DM_Network *network = (DM_Network*) networkdm->data; >>>>>>> >>>>>>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>>>>>> >>>>>>> I admit to not completely understanding DMNetwork. However, it >>>>>> eventually builds a PetscSection for data layout, which >>>>>> you could get from DMGetDefaultSection(). The right thing to do is >>>>>> find where it builds the Section, and put in your BC >>>>>> there, but that sounds like it would entail coding. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks for your responses. >>>>>>> >>>>>>> Miguel >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>>>>>> >>>>>>>> Matthew Knepley writes: >>>>>>>> >>>>>>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>>>>>> abhyshr at mcs.anl.gov >>>>>>>> >> wrote: >>>>>>>> > >>>>>>>> >> You are right. The Jacobian for the power grid application is >>>>>>>> indeed >>>>>>>> >> non-symmetric. 
Is that a problem for your application? >>>>>>>> >> >>>>>>>> > >>>>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>>>> > PetscSection, which eliminates the >>>>>>>> > variables completely. This is how the FEM examples, like ex12, >>>>>>>> work. >>>>>>>> >>>>>>>> You can also use MatZeroRowsColumns() or do the equivalent >>>>>>>> transformation during assembly (my preference). >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Miguel Angel Salazar de Troya* >>>>>>> >>>>>>> Graduate Research Assistant >>>>>>> Department of Mechanical Science and Engineering >>>>>>> University of Illinois at Urbana-Champaign >>>>>>> (217) 550-2360 >>>>>>> salaza11 at illinois.edu >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Miguel Angel Salazar de Troya* >>>>> Graduate Research Assistant >>>>> Department of Mechanical Science and Engineering >>>>> University of Illinois at Urbana-Champaign >>>>> (217) 550-2360 >>>>> salaza11 at illinois.edu >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> *Miguel Angel Salazar de Troya* >>> Graduate Research Assistant >>> Department of Mechanical Science and Engineering >>> University of Illinois at Urbana-Champaign >>> (217) 550-2360 >>> salaza11 at illinois.edu >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Fri Sep 26 10:34:09 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 26 Sep 2014 10:34:09 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: <878ul7cwrz.fsf@jedbrown.org> Message-ID: Ok thanks. Miguel On Fri, Sep 26, 2014 at 10:28 AM, Matthew Knepley wrote: > On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > >> Yeah, but doesn't it only work with the local vectors localX and localF? >> > > I am telling you what the interface for the functions is. You can do > whatever you want inside. > > Matt > > >> Miguel >> >> On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley >> wrote: >> >>> On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya < >>> salazardetroya at gmail.com> wrote: >>> >>>> That means that if we call SNESSetFunction() we don't build the >>>> residual vector in parallel? In the pflow example ( >>>> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tutorials/network/pflow/pf.c.html) >>>> the function FormFunction() (Input for SNESSetFunction() works with the >>>> local vectors. 
I don't understand this. >>>> >>> >>> FormFunction() in that link clearly takes in a global vector X and >>> returns a global vector F. Inside, it >>> converts them to local vectors. This is exactly what you would do for a >>> function given to SNESSetFunction(). >>> >>> Matt >>> >>> >>>> >>>> Thanks >>>> Miguel >>>> >>>> On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >>>> wrote: >>>> >>>>> On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya < >>>>> salazardetroya at gmail.com> wrote: >>>>> >>>>>> Thanks. I had another question about the DM and SNES and TS. There >>>>>> are similar routines to assign the residual and jacobian evaluation to both >>>>>> objects. For the SNES case are: >>>>>> >>>>>> DMSNESSetFunctionLocal >>>>>> DMSNESSetJacobianLocal >>>>>> >>>>>> What are the differences of these with: >>>>>> >>>>>> SNESSetFunction >>>>>> SNESSetJacobian >>>>>> >>>>> >>>>> SNESSetFunction() expects the user to construct the entire parallel >>>>> residual vector. DMSNESSetFunctionLocal() >>>>> expects the user to construct the local pieces of the residual, and >>>>> then it automatically calls DMLocalToGlobal() >>>>> to assembly the full residual. It also converts the input from global >>>>> vectors to local vectors, and in the case of >>>>> DMDA multidimensional arrays. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> and when should we use each? With "Local", it is meant to evaluate >>>>>> the function/jacobian for the elements in the local processor? I could get >>>>>> the local edges in DMNetwork by calling DMNetworkGetEdgeRange? >>>>>> >>>>>> Miguel >>>>>> >>>>>> On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>>>>> wrote: >>>>>> >>>>>>> On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya < >>>>>>> salazardetroya at gmail.com> wrote: >>>>>>> >>>>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>>>> > PetscSection, which eliminates the >>>>>>>> > variables completely. This is how the FEM examples, like ex12, >>>>>>>> work. >>>>>>>> >>>>>>>> Would that be with PetscSectionSetConstraintDof ? For that I will >>>>>>>> need the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>>>>>>> could cast it to DM_Network from the dm, networkdm, declared in the main >>>>>>>> program, maybe something like this: >>>>>>>> >>>>>>>> DM_Network *network = (DM_Network*) networkdm->data; >>>>>>>> >>>>>>>> Then I would loop over the vertices and call PetscSectionSetConstraintDof if it's a boundary node (by checking the corresponding component) >>>>>>>> >>>>>>>> I admit to not completely understanding DMNetwork. However, it >>>>>>> eventually builds a PetscSection for data layout, which >>>>>>> you could get from DMGetDefaultSection(). The right thing to do is >>>>>>> find where it builds the Section, and put in your BC >>>>>>> there, but that sounds like it would entail coding. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Thanks for your responses. >>>>>>>> >>>>>>>> Miguel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Matthew Knepley writes: >>>>>>>>> >>>>>>>>> > On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. < >>>>>>>>> abhyshr at mcs.anl.gov >>>>>>>>> >> wrote: >>>>>>>>> > >>>>>>>>> >> You are right. The Jacobian for the power grid application is >>>>>>>>> indeed >>>>>>>>> >> non-symmetric. Is that a problem for your application? 
>>>>>>>>> >> >>>>>>>>> > >>>>>>>>> > If you need a symmetric Jacobian, you can use the BC facility in >>>>>>>>> > PetscSection, which eliminates the >>>>>>>>> > variables completely. This is how the FEM examples, like ex12, >>>>>>>>> work. >>>>>>>>> >>>>>>>>> You can also use MatZeroRowsColumns() or do the equivalent >>>>>>>>> transformation during assembly (my preference). >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Miguel Angel Salazar de Troya* >>>>>>>> >>>>>>>> Graduate Research Assistant >>>>>>>> Department of Mechanical Science and Engineering >>>>>>>> University of Illinois at Urbana-Champaign >>>>>>>> (217) 550-2360 >>>>>>>> salaza11 at illinois.edu >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> *Miguel Angel Salazar de Troya* >>>>>> Graduate Research Assistant >>>>>> Department of Mechanical Science and Engineering >>>>>> University of Illinois at Urbana-Champaign >>>>>> (217) 550-2360 >>>>>> salaza11 at illinois.edu >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>>> >>>> -- >>>> *Miguel Angel Salazar de Troya* >>>> Graduate Research Assistant >>>> Department of Mechanical Science and Engineering >>>> University of Illinois at Urbana-Champaign >>>> (217) 550-2360 >>>> salaza11 at illinois.edu >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Fri Sep 26 10:53:29 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Fri, 26 Sep 2014 15:53:29 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: What Matt is saying is that there are two interfaces in PETSc for setting the residual evaluation routine: i) SNESSetFunction takes in a function pointer for the residual evaluation routine that has the prototype PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* ctx); X and F are the "global" solution and residual vectors. To compute the global residual evaluation, typically one does -- (a) scattering X and F onto local vectors localX and localF (DMGlobalToLocal), (b) computing the local residual, and (c) gathering the localF in the global F (DMLocalToGlobal). This is what is done in the example. 
ii) DMSNESSetFunctionLocal takes in a function pointer for the residual evaluation routine that has the prototype PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, void* ctx) In this case, the localX and localF get passed to the routine. So, you only have to do the local residual evaluation. PETSc does the LocalToGlobal gather to form the global residual. I chose to use SNESSetFunction in the example. You can use either of them. Shri From: Matthew Knepley Date: Fri, 26 Sep 2014 10:28:26 -0500 To: Miguel Angel Salazar de Troya Cc: Jed Brown , Shri , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] DMPlex with spring elements >On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya > wrote: > >Yeah, but doesn't it only work with the local vectors localX and localF? > > > >I am telling you what the interface for the functions is. You can do >whatever you want inside. > > Matt > > >Miguel > >On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley >wrote: > >On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya > wrote: > >That means that if we call SNESSetFunction() we don't build the residual >vector in parallel? In the pflow example >(http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tut >orials/network/pflow/pf.c.html) the function FormFunction() (Input for >SNESSetFunction() works with the local vectors. I don't understand this. > > > >FormFunction() in that link clearly takes in a global vector X and >returns a global vector F. Inside, it >converts them to local vectors. This is exactly what you would do for a >function given to SNESSetFunction(). > > Matt > > > >Thanks >Miguel > > >On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >wrote: > >On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya > wrote: > >Thanks. I had another question about the DM and SNES and TS. There are >similar routines to assign the residual and jacobian evaluation to both >objects. For the SNES case are: >DMSNESSetFunctionLocal >DMSNESSetJacobianLocal > >What are the differences of these with: > >SNESSetFunction >SNESSetJacobian > > > > >SNESSetFunction() expects the user to construct the entire parallel >residual vector. DMSNESSetFunctionLocal() >expects the user to construct the local pieces of the residual, and then >it automatically calls DMLocalToGlobal() >to assembly the full residual. It also converts the input from global >vectors to local vectors, and in the case of >DMDA multidimensional arrays. > > Thanks, > > Matt > > >and when should we use each? With "Local", it is meant to evaluate the >function/jacobian for the elements in the local processor? I could get >the local edges in DMNetwork by calling DMNetworkGetEdgeRange? > >Miguel > > >On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >wrote: > > > >On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya > wrote: > >> If you need a symmetric Jacobian, you can use the BC facility in >> PetscSection, which eliminates the >> variables completely. This is how the FEM examples, like ex12, work. >Would that be with PetscSectionSetConstraintDof ? For that I will need >the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >could cast it to DM_Network from the dm, networkdm, declared in the main >program, maybe something like this: >DM_Network *network = (DM_Network*) networkdm->data;Then I would loop >over the vertices and call PetscSectionSetConstraintDof if it's a >boundary node (by checking the corresponding component) > > > > >I admit to not completely understanding DMNetwork. 
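A minimal sketch of interface (i) described above: a routine registered with SNESSetFunction() that performs the DMGlobalToLocal scatter, the local residual evaluation, and the DMLocalToGlobal gather itself. ComputeLocalResidual() is a hypothetical user routine, and the DM is assumed to be carried in the user context:

    #include <petscsnes.h>

    extern PetscErrorCode ComputeLocalResidual(DM, Vec, Vec);  /* hypothetical */

    PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
    {
      DM             dm = (DM)ctx;          /* assumption: ctx carries the DM */
      Vec            localX, localF;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = DMGetLocalVector(dm, &localX);CHKERRQ(ierr);
      ierr = DMGetLocalVector(dm, &localF);CHKERRQ(ierr);
      /* (a) scatter the global state into the local (ghosted) vector */
      ierr = DMGlobalToLocalBegin(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
      ierr = DMGlobalToLocalEnd(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
      ierr = VecSet(localF, 0.0);CHKERRQ(ierr);
      /* (b) fill localF from localX; hypothetical user routine */
      ierr = ComputeLocalResidual(dm, localX, localF);CHKERRQ(ierr);
      /* (c) gather the local contributions into the global residual */
      ierr = VecSet(F, 0.0);CHKERRQ(ierr);
      ierr = DMLocalToGlobalBegin(dm, localF, ADD_VALUES, F);CHKERRQ(ierr);
      ierr = DMLocalToGlobalEnd(dm, localF, ADD_VALUES, F);CHKERRQ(ierr);
      ierr = DMRestoreLocalVector(dm, &localX);CHKERRQ(ierr);
      ierr = DMRestoreLocalVector(dm, &localF);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

    /* registered as:  SNESSetFunction(snes, F, FormFunction, (void*)dm);
       with interface (ii), DMSNESSetFunctionLocal(), only step (b) is written
       by the user; PETSc performs the conversions and the gather. */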
However, it eventually >builds a PetscSection for data layout, which >you could get from DMGetDefaultSection(). The right thing to do is find >where it builds the Section, and put in your BC >there, but that sounds like it would entail coding. > > Thanks, > > Matt > > > > >Thanks for your responses.Miguel > > > > >On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > >Matthew Knepley writes: > >> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. >>>> wrote: >> >>> You are right. The Jacobian for the power grid application is indeed >>> non-symmetric. Is that a problem for your application? >>> >> >> If you need a symmetric Jacobian, you can use the BC facility in >> PetscSection, which eliminates the >> variables completely. This is how the FEM examples, like ex12, work. > >You can also use MatZeroRowsColumns() or do the equivalent >transformation during assembly (my preference). > > > > > > > > >-- >Miguel Angel Salazar de Troya > > >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 > > >salaza11 at illinois.edu > > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener > > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener > > > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu > > > > > > > > > >-- >What most experimenters take for granted before they begin their >experiments is infinitely more interesting than any results to which >their experiments lead. >-- Norbert Wiener From 4bikerboyjohn at gmail.com Fri Sep 26 11:52:09 2014 From: 4bikerboyjohn at gmail.com (John Alletto) Date: Fri, 26 Sep 2014 09:52:09 -0700 Subject: [petsc-users] General question about a structured non-uniform grid Message-ID: <5FADFB8A-F6FF-4EC0-8E2F-3E5E306D4AAC@gmail.com> All, I have a general question about a structured non-uniform grid. I have set the coordinates X,Y,Z in my code by first calling setUniformGrid and then by setting the coordinates. I have printed out the coordinates and those are as expected. I an non-uniform grid the code will need a specific delta_x delta_y and delta_z since they will differ through out the grid. Does PETSc take a derivative of the coordinates (by default) to get the list of delta?s or do I need to do something specific ? Many Thanks in Advance John -------------- next part -------------- An HTML attachment was scrubbed... 
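Regarding the non-uniform structured grid question just above: as Knepley's reply below explains, PETSc does not difference the coordinates for you; the user's residual code reads the coordinate vector and forms the spacings itself. A minimal sketch, assuming da is the 3D DMDA in question (only the x-spacings are shown):

    DM             cda;
    Vec            coords;
    DMDACoor3d     ***c;
    PetscInt       i, j, k, xs, ys, zs, xm, ym, zm;
    PetscScalar    hxw, hxe;   /* spacing to the west/east neighbour */
    PetscErrorCode ierr;

    ierr = DMGetCoordinateDM(da, &cda);CHKERRQ(ierr);
    ierr = DMGetCoordinatesLocal(da, &coords);CHKERRQ(ierr);
    ierr = DMDAVecGetArray(cda, coords, &c);CHKERRQ(ierr);
    ierr = DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr);
    for (k = zs; k < zs+zm; ++k) {
      for (j = ys; j < ys+ym; ++j) {
        for (i = xs; i < xs+xm; ++i) {
          /* interior nodes assumed; physical-boundary nodes need one-sided deltas */
          hxw = c[k][j][i].x   - c[k][j][i-1].x;
          hxe = c[k][j][i+1].x - c[k][j][i].x;
          /* ... use hxw/hxe (and the analogous y/z spacings) in the stencil ... */
        }
      }
    }
    ierr = DMDAVecRestoreArray(cda, coords, &c);CHKERRQ(ierr);

The local coordinate vector includes the ghost nodes, so the i-1 / i+1 neighbours are available inside the loop everywhere except at the physical boundary.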
URL: From knepley at gmail.com Fri Sep 26 11:58:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 11:58:23 -0500 Subject: [petsc-users] General question about a structured non-uniform grid In-Reply-To: <5FADFB8A-F6FF-4EC0-8E2F-3E5E306D4AAC@gmail.com> References: <5FADFB8A-F6FF-4EC0-8E2F-3E5E306D4AAC@gmail.com> Message-ID: On Fri, Sep 26, 2014 at 11:52 AM, John Alletto <4bikerboyjohn at gmail.com> wrote: > > > All, > > I have a general question about a structured non-uniform grid. > > I have set the coordinates X,Y,Z in my code by first calling > setUniformGrid and then by setting the coordinates. > I have printed out the coordinates and those are as expected. > > I an non-uniform grid the code will need a specific delta_x delta_y and > delta_z since they will differ through out the grid. > > Does PETSc take a derivative of the coordinates (by default) to get the > list of delta?s or do I need to do something specific ? > The code that evaluates residuals is all written by the user, so PETSc is not doing anything with the coordinates. In our examples, we pull out the coordinates for adjacent nodes and take the difference. Matt > Many Thanks in Advance > John > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Fri Sep 26 12:33:05 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Fri, 26 Sep 2014 12:33:05 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: I understand. Thanks a lot. Miguel On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. < abhyshr at mcs.anl.gov> wrote: > What Matt is saying is that there are two interfaces in PETSc for setting > the residual evaluation routine: > > i) SNESSetFunction takes in a function pointer for the residual evaluation > routine that has the prototype > PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* > ctx); > > X and F are the "global" solution and residual vectors. To compute the > global residual evaluation, typically one does -- (a) scattering X and F > onto local vectors localX and localF (DMGlobalToLocal), (b) computing the > local residual, and (c) gathering the localF in the global F > (DMLocalToGlobal). This is what is done in the example. > > ii) DMSNESSetFunctionLocal takes in a function pointer for the residual > evaluation routine that has the prototype > PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, > void* ctx) > > In this case, the localX and localF get passed to the routine. So, you > only have to do the local residual evaluation. PETSc does the > LocalToGlobal gather to form the global residual. > > I chose to use SNESSetFunction in the example. You can use either of them. > > Shri > > From: Matthew Knepley > Date: Fri, 26 Sep 2014 10:28:26 -0500 > To: Miguel Angel Salazar de Troya > Cc: Jed Brown , Shri , > "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > > >On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya > > wrote: > > > >Yeah, but doesn't it only work with the local vectors localX and localF? > > > > > > > >I am telling you what the interface for the functions is. You can do > >whatever you want inside. 
> > > > Matt > > > > > >Miguel > > > >On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley > >wrote: > > > >On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya > > wrote: > > > >That means that if we call SNESSetFunction() we don't build the residual > >vector in parallel? In the pflow example > >( > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tut > >orials/network/pflow/pf.c.html) the function FormFunction() (Input for > >SNESSetFunction() works with the local vectors. I don't understand this. > > > > > > > >FormFunction() in that link clearly takes in a global vector X and > >returns a global vector F. Inside, it > >converts them to local vectors. This is exactly what you would do for a > >function given to SNESSetFunction(). > > > > Matt > > > > > > > >Thanks > >Miguel > > > > > >On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley > >wrote: > > > >On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya > > wrote: > > > >Thanks. I had another question about the DM and SNES and TS. There are > >similar routines to assign the residual and jacobian evaluation to both > >objects. For the SNES case are: > >DMSNESSetFunctionLocal > >DMSNESSetJacobianLocal > > > >What are the differences of these with: > > > >SNESSetFunction > >SNESSetJacobian > > > > > > > > > >SNESSetFunction() expects the user to construct the entire parallel > >residual vector. DMSNESSetFunctionLocal() > >expects the user to construct the local pieces of the residual, and then > >it automatically calls DMLocalToGlobal() > >to assembly the full residual. It also converts the input from global > >vectors to local vectors, and in the case of > >DMDA multidimensional arrays. > > > > Thanks, > > > > Matt > > > > > >and when should we use each? With "Local", it is meant to evaluate the > >function/jacobian for the elements in the local processor? I could get > >the local edges in DMNetwork by calling DMNetworkGetEdgeRange? > > > >Miguel > > > > > >On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley > >wrote: > > > > > > > >On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya > > wrote: > > > >> If you need a symmetric Jacobian, you can use the BC facility in > >> PetscSection, which eliminates the > >> variables completely. This is how the FEM examples, like ex12, work. > >Would that be with PetscSectionSetConstraintDof ? For that I will need > >the PetscSection, DofSection, within DMNetwork, how can I obtain it? I > >could cast it to DM_Network from the dm, networkdm, declared in the main > >program, maybe something like this: > >DM_Network *network = (DM_Network*) networkdm->data;Then I would loop > >over the vertices and call PetscSectionSetConstraintDof if it's a > >boundary node (by checking the corresponding component) > > > > > > > > > >I admit to not completely understanding DMNetwork. However, it eventually > >builds a PetscSection for data layout, which > >you could get from DMGetDefaultSection(). The right thing to do is find > >where it builds the Section, and put in your BC > >there, but that sounds like it would entail coding. > > > > Thanks, > > > > Matt > > > > > > > > > >Thanks for your responses.Miguel > > > > > > > > > >On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > > > >Matthew Knepley writes: > > > >> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. > >> >>> wrote: > >> > >>> You are right. The Jacobian for the power grid application is indeed > >>> non-symmetric. Is that a problem for your application? 
> >>> > >> > >> If you need a symmetric Jacobian, you can use the BC facility in > >> PetscSection, which eliminates the > >> variables completely. This is how the FEM examples, like ex12, work. > > > >You can also use MatZeroRowsColumns() or do the equivalent > >transformation during assembly (my preference). > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > > > > > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > > > > > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > > > >-- > >What most experimenters take for granted before they begin their > >experiments is infinitely more interesting than any results to which > >their experiments lead. > >-- Norbert Wiener > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > > > >-- > >What most experimenters take for granted before they begin their > >experiments is infinitely more interesting than any results to which > >their experiments lead. > >-- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > > > >-- > >What most experimenters take for granted before they begin their > >experiments is infinitely more interesting than any results to which > >their experiments lead. > >-- Norbert Wiener > > > > > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > > > >-- > >What most experimenters take for granted before they begin their > >experiments is infinitely more interesting than any results to which > >their experiments lead. > >-- Norbert Wiener > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 26 14:52:52 2014 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Sep 2014 14:52:52 -0500 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: Message-ID: On Thu, Sep 25, 2014 at 10:29 PM, subramanya sadasiva wrote: > Hi Matt, > Sorry about that, > I changed > if 'time' in h5: > time = np.array(h5['time']).flatten() > else: > time = np.empty(1) > > The code now fails in the writeSpaceGridHeader function. 
with the error, > > > Traceback (most recent call last): > > File > "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 232, in > > generateXdmf(sys.argv[1]) > > File > "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 227, in generateXdmf > > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, > cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) > > File > "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 180, in write > > self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) > > File > "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 64, in writeSpaceGridHeader > print self.cellMap[cellDim][numCorners] > > The error is due to the fact that numCorners is set to be 1 , while > celldim=2. cellMap has the following elements. > > {1: {1: 'Polyvertex', 2: 'Polyline'}, 2: {3: 'Triangle', 4: > 'Quadrilateral'}, 3: {8: 'Hexahedron', 4: 'Tetrahedron'}} > > > I also tried > > ./ex12 -dm_view vtk:my.vtk:vtk_vtu . This doesn't seem to do anything. Is > there any specific option I need to build petsc with to get vtk output? My > current build has hdf5 and netcdf enabled. > I made the change in a branch, and merged to 'next', so it will work if you pull. Here is a test from ex12 (testnum 39 in builder.py): ./ex12 -run_type full -refinement_limit 0.015625 -interpolate 1 -petscspace_order 2 -pc_type gamg -ksp_rtol 1.0e-10 -ksp_monitor_short -ksp_converged_reason -snes_monitor_short -snes_converged_reason -dm_view hdf5:sol.h5 -snes_view_solution hdf5:sol.h5::append which makes sol.h5. Then I run ./bin/pythonscripts/petsc_gen_xdmf.py sol.h5 which makes sol.xmf. I load it up in Paraview and it makes the attached picture. Thanks, Matt > Thanks, > > Subramanya > > Date: Wed, 24 Sep 2014 17:36:52 -0500 > Subject: Re: [petsc-users] Generating xdmf from h5 file. > From: knepley at gmail.com > To: potaman at outlook.com; petsc-maint at mcs.anl.gov; petsc-users at mcs.anl.gov > > On Wed, Sep 24, 2014 at 5:29 PM, subramanya sadasiva > wrote: > > Hi Matt, > That did not help. > > > That's not enough description to fix anything, and fixing it will require > programming. > > > Is there any other way to output the mesh to something that paraview can > view? I tried outputting the file to a vtk file using > ex12 -dm_view vtk:my.vtk:ascii_vtk > > which, I saw in another post on the forums, but that did not give me any > output. > > > This is mixing two different things. PETSc has a diagnostic ASCII vtk > output, so the type would be ascii, not vtk, > and format ascii_vtk . It also has a production VTU output, which is type > vtk with format vtk_vtu. > > Thanks, > > Matt > > > > Subramanya > > ------------------------------ > Date: Wed, 24 Sep 2014 17:19:51 -0500 > Subject: Re: [petsc-users] Generating xdmf from h5 file. > From: knepley at gmail.com > To: potaman at outlook.com > CC: petsc-users at mcs.anl.gov > > On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva > wrote: > > Hi, > i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. 
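For reference, the HDF5 file that -dm_view hdf5:sol.h5 produces from the command line can also be written from code with the HDF5 viewer. A sketch, assuming a PETSc build configured with HDF5 and a version in which DMView/VecView support the HDF5 viewer; dm and u are the user's mesh and solution vector:

    PetscViewer    viewer;
    PetscErrorCode ierr;

    ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "sol.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
    ierr = PetscObjectSetName((PetscObject)u, "u");CHKERRQ(ierr);  /* dataset name in the file */
    ierr = DMView(dm, viewer);CHKERRQ(ierr);    /* mesh, as -dm_view hdf5:... would write */
    ierr = VecView(u, viewer);CHKERRQ(ierr);    /* solution vector */
    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);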
> The h5 file was generated by snes/ex12 which was run as, > > ex12 -dm_view hdf5:my.h5 > > When I do, > petsc_gen_xdmf.py my.h5 > > I get the following error, > > File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", > line 220, in > generateXdmf(sys.argv[1]) > File > "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line > 208, in generateXdmf > time = np.array(h5['time']).flatten() > File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in > __getitem__ > oid = h5o.open(self.id, self._e(name), lapl=self._lapl) > File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) > KeyError: "unable to open object (Symbol table: Can't open object)" > > I am not sure if the error is on my end. This is on Ubuntu 14.04 with the > serial version of hdf5. I built petsc with --download-hdf5, is it necessary > to use the same version of hdf5 to generate the xdmf file? > > > That code is alpha, and mainly built for me to experiment with an > application here, so it is not user-friendly. In your > HDF5 file, there is no 'time' since you are not running a TS. This access > to h5['time'] should just be protected, and > an empty array should be put in if its not there. > > Matt > > > Thanks > Subramanya > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sol.pdf Type: application/pdf Size: 1331865 bytes Desc: not available URL: From salazardetroya at gmail.com Sat Sep 27 12:31:18 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Sat, 27 Sep 2014 12:31:18 -0500 Subject: [petsc-users] Best way to save entire solution history of a TS Message-ID: Hello I'm performing an adjoint sensitivity analysis in a transient problem and I need to save the entire solution history. I don't have many degrees of freedom or time steps so I should be able to avoid memory problems. I've thought of creating a std::vector and save all the PETSc Vec that I get from TSGetSolution there. Would this be a problem if I'm running the simulation in parallel? Is there a better way to save the entire solution history in memory? Thanks Miguel -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From phoenixauriga at gmail.com Sun Sep 28 06:15:31 2014 From: phoenixauriga at gmail.com (Parvathi M.K) Date: Sun, 28 Sep 2014 16:45:31 +0530 Subject: [petsc-users] Query regarding accessing data in a Petsc Vec object Message-ID: Hey, I am fairly new to PetSc. I was just curious as to how you can directly access the data stored in a Petsc object, say a vector without using VecGetValues(). 
I've gone through the header files, petscimpl.h and vecimpl.h which contain the definitions of the PetscObject and Vec structures. I cannot figure out which member of the struct holds the Vector data entered using VecSet(). Also, I could not find the definitions of the functions in VecOps. Could someone tell me where I can find these function definitions? Thank you :) Parvathi -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Sun Sep 28 06:17:40 2014 From: popov at uni-mainz.de (Anton Popov) Date: Sun, 28 Sep 2014 13:17:40 +0200 Subject: [petsc-users] Query regarding accessing data in a Petsc Vec object In-Reply-To: References: Message-ID: <5427EE54.1000208@uni-mainz.de> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecGetArray.html On 9/28/14 1:15 PM, Parvathi M.K wrote: > Hey, > I am fairly new to PetSc. I was just curious as to how you can > directly access the data stored in a Petsc object, say a vector > without using VecGetValues(). > > I've gone through the header files, petscimpl.h and vecimpl.h which > contain the definitions of the PetscObject and Vec structures. I > cannot figure out which member of the struct holds the Vector data > entered using VecSet(). Also, I could not find the definitions of the > functions in VecOps. Could someone tell me where I can find these > function definitions? > > Thank you :) > > Parvathi -------------- next part -------------- An HTML attachment was scrubbed... URL: From phoenixauriga at gmail.com Sun Sep 28 06:57:43 2014 From: phoenixauriga at gmail.com (Parvathi M.K) Date: Sun, 28 Sep 2014 17:27:43 +0530 Subject: [petsc-users] Query regarding accessing data in a Petsc Vec object In-Reply-To: References: Message-ID: I want to access the data stored in the vec object directly without using any functions. Is this possible? I tried using something like p->hdr._ in a printf statement (where p is a vector). I wish to know which element of the _p_Vec structure stores the data. On Sun, Sep 28, 2014 at 4:45 PM, Parvathi M.K wrote: > Hey, > > I am fairly new to PetSc. I was just curious as to how you can directly > access the data stored in a Petsc object, say a vector without using > VecGetValues(). > > I've gone through the header files, petscimpl.h and vecimpl.h which > contain the definitions of the PetscObject and Vec structures. I cannot > figure out which member of the struct holds the Vector data entered using > VecSet(). Also, I could not find the definitions of the functions in > VecOps. Could someone tell me where I can find these function definitions? > > Thank you :) > > Parvathi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Sun Sep 28 07:04:20 2014 From: popov at uni-mainz.de (Anton Popov) Date: Sun, 28 Sep 2014 14:04:20 +0200 Subject: [petsc-users] Query regarding accessing data in a Petsc Vec object In-Reply-To: References: Message-ID: <5427F944.1020109@uni-mainz.de> On 9/28/14 1:57 PM, Parvathi M.K wrote: > I want to access the data stored in the vec object directly without > using any functions. Is this possible? > > I tried using something like p->hdr._ in a printf statement (where p > is a vector). I wish to know which element of the _p_Vec structure > stores the data. I think it's impossible for a good reason (or at least highly discouraged). Why don't you want to use a prescribed interface? It's really light-weight. But don't forget to call VecRestoreArray after directly manipulating the data. 
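A minimal sketch of the VecGetArray/VecRestoreArray interface Anton points to above; v is assumed to be an existing, assembled vector:

    PetscScalar    *a;
    PetscInt       i, nlocal;
    PetscErrorCode ierr;

    ierr = VecGetLocalSize(v, &nlocal);CHKERRQ(ierr);
    ierr = VecGetArray(v, &a);CHKERRQ(ierr);        /* raw pointer to the local entries */
    for (i = 0; i < nlocal; ++i) a[i] *= 2.0;       /* manipulate the data in place */
    ierr = VecRestoreArray(v, &a);CHKERRQ(ierr);    /* required after direct access */

Each process only sees its own locally owned entries, which is why there is no way (and no need) to index the distributed data directly through the struct members.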
Anton > > On Sun, Sep 28, 2014 at 4:45 PM, Parvathi M.K > wrote: > > Hey, > I am fairly new to PetSc. I was just curious as to how you can > directly access the data stored in a Petsc object, say a vector > without using VecGetValues(). > > I've gone through the header files, petscimpl.h and vecimpl.h > which contain the definitions of the PetscObject and Vec > structures. I cannot figure out which member of the struct holds > the Vector data entered using VecSet(). Also, I could not find the > definitions of the functions in VecOps. Could someone tell me > where I can find these function definitions? > > Thank you :) > > Parvathi > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Sun Sep 28 07:09:13 2014 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Sun, 28 Sep 2014 14:09:13 +0200 Subject: [petsc-users] Query regarding accessing data in a Petsc Vec object In-Reply-To: References: Message-ID: <5427FA69.1030602@gmail.com> Note that PETSc is designed to hide the details of the implementations from you (It's in C, but uses the techniques of Object-Oriented programming that you might be more familiar with in C++), so normal use would never require the user to know anything about the way PETSc objects represent data. In cases cases where access to lower-level representations of the data is useful, there is an interface provided to do this, as with Vec[Get/Set]Array (see the tutorials included in the src/ tree for many examples - if you look at the bottom of the link Anton provided, you'll see a handy list of links to ones that use that particular function). The definitions of the functions in VecOps will depend on the implementation of Vec being used, and this is in general only known at runtime (which is a great advantage, as you can use command line arguments to change which implementation is used without recompiling anything). The various function definitions should be found in src/vec/vec/impls and subdirectories. So, while it is possible to access the data stored in an object and print it and so on, you'd have to include a private header and it'd be a bit ugly. Perhaps a better way to poke around is to use a debugger like gdb or lldb (or something else built into your IDE if you're using one) - that should be quicker, and not require to you to spend time recompiling things. On 9/28/14 1:57 PM, Parvathi M.K wrote: > I want to access the data stored in the vec object directly without > using any functions. Is this possible? > > I tried using something like p->hdr._ in a printf statement (where p > is a vector). I wish to know which element of the _p_Vec structure > stores the data. > > On Sun, Sep 28, 2014 at 4:45 PM, Parvathi M.K > wrote: > > Hey, > I am fairly new to PetSc. I was just curious as to how you can > directly access the data stored in a Petsc object, say a vector > without using VecGetValues(). > > I've gone through the header files, petscimpl.h and vecimpl.h > which contain the definitions of the PetscObject and Vec > structures. I cannot figure out which member of the struct holds > the Vector data entered using VecSet(). Also, I could not find the > definitions of the functions in VecOps. Could someone tell me > where I can find these function definitions? > > Thank you :) > > Parvathi > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From potaman at outlook.com Sun Sep 28 11:15:54 2014 From: potaman at outlook.com (subramanya sadasiva) Date: Sun, 28 Sep 2014 12:15:54 -0400 Subject: [petsc-users] Generating xdmf from h5 file. In-Reply-To: References: , , , , , Message-ID: Thanks Matt, That worked perfectly. Subramanya Date: Fri, 26 Sep 2014 14:52:52 -0500 Subject: Re: [petsc-users] Generating xdmf from h5 file. From: knepley at gmail.com To: potaman at outlook.com CC: petsc-users at mcs.anl.gov On Thu, Sep 25, 2014 at 10:29 PM, subramanya sadasiva wrote: Hi Matt, Sorry about that, I changed if 'time' in h5: time = np.array(h5['time']).flatten() else: time = np.empty(1) The code now fails in the writeSpaceGridHeader function. with the error, Traceback (most recent call last): File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 232, in generateXdmf(sys.argv[1]) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 227, in generateXdmf Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 180, in write self.writeSpaceGridHeader(fp, numCells, numCorners, cellDim, spaceDim) File "/Users/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 64, in writeSpaceGridHeader print self.cellMap[cellDim][numCorners] The error is due to the fact that numCorners is set to be 1 , while celldim=2. cellMap has the following elements. {1: {1: 'Polyvertex', 2: 'Polyline'}, 2: {3: 'Triangle', 4: 'Quadrilateral'}, 3: {8: 'Hexahedron', 4: 'Tetrahedron'}} I also tried ./ex12 -dm_view vtk:my.vtk:vtk_vtu . This doesn't seem to do anything. Is there any specific option I need to build petsc with to get vtk output? My current build has hdf5 and netcdf enabled. I made the change in a branch, and merged to 'next', so it will work if you pull. Here is a test from ex12 (testnum 39 in builder.py): ./ex12 -run_type full -refinement_limit 0.015625 -interpolate 1 -petscspace_order 2 -pc_type gamg -ksp_rtol 1.0e-10 -ksp_monitor_short -ksp_converged_reason -snes_monitor_short -snes_converged_reason -dm_view hdf5:sol.h5 -snes_view_solution hdf5:sol.h5::append which makes sol.h5. Then I run ./bin/pythonscripts/petsc_gen_xdmf.py sol.h5 which makes sol.xmf. I load it up in Paraview and it makes the attached picture. Thanks, Matt Thanks, Subramanya Date: Wed, 24 Sep 2014 17:36:52 -0500Subject: Re: [petsc-users] Generating xdmf from h5 file. From: knepley at gmail.com To: potaman at outlook.com; petsc-maint at mcs.anl.gov; petsc-users at mcs.anl.gov On Wed, Sep 24, 2014 at 5:29 PM, subramanya sadasiva wrote: Hi Matt, That did not help. That's not enough description to fix anything, and fixing it will require programming. Is there any other way to output the mesh to something that paraview can view? I tried outputting the file to a vtk file using ex12 -dm_view vtk:my.vtk:ascii_vtk which, I saw in another post on the forums, but that did not give me any output. This is mixing two different things. PETSc has a diagnostic ASCII vtk output, so the type would be ascii, not vtk,and format ascii_vtk . It also has a production VTU output, which is type vtk with format vtk_vtu. Thanks, Matt Subramanya Date: Wed, 24 Sep 2014 17:19:51 -0500 Subject: Re: [petsc-users] Generating xdmf from h5 file. 
From: knepley at gmail.com To: potaman at outlook.com CC: petsc-users at mcs.anl.gov On Wed, Sep 24, 2014 at 5:08 PM, subramanya sadasiva wrote: Hi, i was trying to use petsc_gen_xdmf.py to convert a h5 file to a xdmf file. The h5 file was generated by snes/ex12 which was run as, ex12 -dm_view hdf5:my.h5 When I do, petsc_gen_xdmf.py my.h5 I get the following error, File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 220, in generateXdmf(sys.argv[1]) File "/home/ssadasiv/software/petsc/bin/pythonscripts/petsc_gen_xdmf.py", line 208, in generateXdmf time = np.array(h5['time']).flatten() File "/usr/lib/python2.7/dist-packages/h5py/_hl/group.py", line 153, in __getitem__ oid = h5o.open(self.id, self._e(name), lapl=self._lapl) File "h5o.pyx", line 173, in h5py.h5o.open (h5py/h5o.c:3403) KeyError: "unable to open object (Symbol table: Can't open object)" I am not sure if the error is on my end. This is on Ubuntu 14.04 with the serial version of hdf5. I built petsc with --download-hdf5, is it necessary to use the same version of hdf5 to generate the xdmf file? That code is alpha, and mainly built for me to experiment with an application here, so it is not user-friendly. In yourHDF5 file, there is no 'time' since you are not running a TS. This access to h5['time'] should just be protected, andan empty array should be put in if its not there. Matt Thanks Subramanya -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Sep 28 13:34:42 2014 From: jed at jedbrown.org (Jed Brown) Date: Sun, 28 Sep 2014 12:34:42 -0600 Subject: [petsc-users] Best way to save entire solution history of a TS In-Reply-To: References: Message-ID: <87r3yva919.fsf@jedbrown.org> Miguel Angel Salazar de Troya writes: > Hello > > I'm performing an adjoint sensitivity analysis in a transient problem and I > need to save the entire solution history. I don't have many degrees of > freedom or time steps so I should be able to avoid memory problems. I've > thought of creating a std::vector and save all the PETSc Vec that I > get from TSGetSolution there. Would this be a problem if I'm running the > simulation in parallel? Is there a better way to save the entire solution > history in memory? Your suggestion is fine. I would do it from a monitor (TSMonitorSet) and VecCopy the solution vector into your array of solutions. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From evanum at gmail.com Sun Sep 28 18:24:35 2014 From: evanum at gmail.com (Evan Um) Date: Sun, 28 Sep 2014 16:24:35 -0700 Subject: [petsc-users] KSP_DIVERGED_INDEFINITE_PC message even with PCFactorSetShiftType Message-ID: Dear PETSC Users, I try to solve a diffusion problem in the time domain. Its system matrix is theoretically SPD. I use KSPCG solver. Direct MUMPS solver generates a preconditioner. 
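A sketch of the TSMonitorSet()/VecCopy() approach Jed suggests above for saving the whole solution history (note: this refers to the solution-history thread, not to the surrounding KSP_DIVERGED_INDEFINITE_PC message). The HistoryCtx structure and its sizing are assumptions:

    #include <petscts.h>

    typedef struct {
      Vec      *history;   /* preallocated array, e.g. of length maxsteps+1 */
      PetscInt  count;
    } HistoryCtx;

    PetscErrorCode SaveSolution(TS ts, PetscInt step, PetscReal time, Vec U, void *ctx)
    {
      HistoryCtx     *h = (HistoryCtx*)ctx;
      PetscErrorCode  ierr;

      PetscFunctionBegin;
      ierr = VecDuplicate(U, &h->history[h->count]);CHKERRQ(ierr);
      ierr = VecCopy(U, h->history[h->count]);CHKERRQ(ierr);  /* deep copy; fine in parallel */
      h->count++;
      PetscFunctionReturn(0);
    }

    /* registered once, before TSSolve():
         TSMonitorSet(ts, SaveSolution, &hctx, NULL);
       The monitor is called at every accepted step, so each process stores its
       own local part of every saved vector. */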
Thus, within 1-2 iterations, a solution is expected to converge. Such convergence is observed in most test problems. However, in some test problems, I get non-convergence with a message: KSP_DIVERGED_INDEFINITE_PC. This problem is alleviated when I use PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE). However, the indefiniteness error message occurs later during time-stepping. I am aware of PETSC function PCFactorSetShiftAmount. However, before I try this, I couldn't find the information about a default shift amount when I use PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE). How can I find the default shift amount? More generally, how can I stably solve SPD problems that are close to indefiniteness using KSP options? To get the reference solution, I was able to solve the same problem above using SuiteSparse (serial) but its solution time was not practical. In advance, thanks for your advice. Regards, Evan Sample Code: KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt); KSPSetOperators(ksp_fetd_dt, A_dt, A_dt); KSPSetType (ksp_fetd_dt, KSPPREONLY); KSPGetPC(ksp_fetd_dt, &pc_fetd_dt); MatSetOption(A_dt, MAT_SPD, PETSC_TRUE); PCSetType(pc_fetd_dt, PCCHOLESKY); PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS); PCFactorSetUpMatSolverPackage(pc_fetd_dt); PCFactorGetMatrix(pc_fetd_dt, &F_dt); KSPSetType(ksp_fetd_dt, KSPCG); PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE); -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Sep 28 18:54:51 2014 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 28 Sep 2014 18:54:51 -0500 Subject: [petsc-users] KSP_DIVERGED_INDEFINITE_PC message even with PCFactorSetShiftType In-Reply-To: References: Message-ID: On Sun, Sep 28, 2014 at 6:24 PM, Evan Um wrote: > Dear PETSC Users, > > I try to solve a diffusion problem in the time domain. Its system matrix > is theoretically SPD. I use KSPCG solver. Direct MUMPS solver generates a > preconditioner. Thus, within 1-2 iterations, a solution is expected to > converge. Such convergence is observed in most test problems. However, in > some test problems, I get non-convergence with a message: > KSP_DIVERGED_INDEFINITE_PC. This problem is alleviated when I use > PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE). However, the > indefiniteness error message occurs later during time-stepping. I am aware > of PETSC function PCFactorSetShiftAmount. However, before I try this, I > couldn't find the information about a default shift amount when I use > PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE). How can I > find the default shift amount? > -ksp_view should tell you the shift amount > More generally, how can I stably solve SPD problems that are close to > indefiniteness using KSP options? To get the reference solution, I was able > to solve the same problem above using SuiteSparse (serial) but its solution > time was not practical. > This sounds like a discretization problem. However, you can always change solvers to GMRES. If you are only doing 1-2 iterates, there is no advantage to CG (other than it minimizing a different norm). Thanks, Matt > In advance, thanks for your advice. 
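On the shift question quoted above: besides reading the value back with -ksp_view, the shift can be set explicitly instead of relying on the default chosen by MAT_SHIFT_POSITIVE_DEFINITE. A sketch (the value 1.e-10 is purely illustrative, and whether the shift is honoured by the external MUMPS factorization should be confirmed with -ksp_view, as Matt suggests):

    ierr = PCFactorSetShiftType(pc_fetd_dt, MAT_SHIFT_POSITIVE_DEFINITE);CHKERRQ(ierr);
    ierr = PCFactorSetShiftAmount(pc_fetd_dt, 1.e-10);CHKERRQ(ierr);
    /* equivalent run-time options:
         -pc_factor_shift_type positive_definite -pc_factor_shift_amount 1.e-10 */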
> > Regards, > Evan > > Sample Code: > > KSPCreate(PETSC_COMM_WORLD, &ksp_fetd_dt); > > KSPSetOperators(ksp_fetd_dt, A_dt, A_dt); > > KSPSetType (ksp_fetd_dt, KSPPREONLY); > > KSPGetPC(ksp_fetd_dt, &pc_fetd_dt); > > MatSetOption(A_dt, MAT_SPD, PETSC_TRUE); > > PCSetType(pc_fetd_dt, PCCHOLESKY); > > PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS); > > PCFactorSetUpMatSolverPackage(pc_fetd_dt); > > PCFactorGetMatrix(pc_fetd_dt, &F_dt); > > KSPSetType(ksp_fetd_dt, KSPCG); > > PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE); > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
From filippo.leonardi at sam.math.ethz.ch Mon Sep 29 08:42:19 2014 From: filippo.leonardi at sam.math.ethz.ch (Filippo Leonardi) Date: Mon, 29 Sep 2014 15:42:19 +0200 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation Message-ID: <2490546.DNVhllGaLT@besikovitch-ii> Hi, I am trying to solve a standard second order central differenced Poisson equation in parallel, in 3D, using a 3D structured DMDA (extremely standard Laplacian matrix). I want to get some nice scaling (especially weak), but my results show that the Krylov method is not performing as expected. The problem (at least for CG + BJacobi) seems to lie in the number of iterations. In particular, the number of iterations grows with CG (the matrix is SPD) + BJacobi as the mesh is refined (probably due to the condition number increasing) and as the number of processors is increased (probably due to the BJacobi preconditioner). For instance, I tried the following setup: 1 proc to solve 32^3 domain => 20 iterations 8 procs to solve 64^3 domain => 60 iterations 64 procs to solve 128^3 domain => 101 iterations Is there something pathological with my runs (maybe I am missing something)? Is there somebody who can provide me with weak scaling benchmarks for equivalent problems? (Maybe there is some better preconditioner for this problem.) I am also aware that Multigrid is even better for this problem, but the **scalability** of my runs seems to be as bad as with CG. -pc_mg_galerkin -pc_type mg (both directly with richardson and as a preconditioner to cg) The following is the "-log_summary" of a 128^3 run with CG + BJacobi; notice that I solve the system multiple times (hence KSPSolve is multiplied by 128). Tell me if I missed some detail and sorry for the length of the post.
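For reference, a solver setup of the kind described above is sketched below: a 3D DMDA attached to the KSP with KSPSetDM(), so that options such as -pc_type mg (and -da_refine for weak-scaling studies) can build the geometric hierarchy automatically. This is only a minimal sketch in the style of the PETSc KSP tutorial ex45, not the code actually used for the runs in this thread; ComputeRHS() and ComputeMatrix() are assumed user callbacks that fill the right-hand side and the 7-point Laplacian, error checking is omitted, and the calling sequences shown are the petsc-3.5-era ones.

  #include <petscksp.h>
  #include <petscdmda.h>

  /* Assumed user callbacks (hypothetical, as in the ex45 tutorial):
     fill the right-hand side and the 7-point Laplacian stencil. */
  extern PetscErrorCode ComputeRHS(KSP,Vec,void*);
  extern PetscErrorCode ComputeMatrix(KSP,Mat,Mat,void*);

  int main(int argc,char **argv)
  {
    DM  da;
    KSP ksp;

    PetscInitialize(&argc,&argv,NULL,NULL);
    /* 33^3 base grid; -da_refine <n> refines it, which is convenient for weak scaling */
    DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,
                 DMDA_STENCIL_STAR,33,33,33,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
                 1,1,NULL,NULL,NULL,&da);
    KSPCreate(PETSC_COMM_WORLD,&ksp);
    KSPSetDM(ksp,da);                        /* lets -pc_type mg coarsen the DMDA geometrically */
    KSPSetComputeRHS(ksp,ComputeRHS,NULL);
    KSPSetComputeOperators(ksp,ComputeMatrix,NULL);
    KSPSetFromOptions(ksp);                  /* e.g. -ksp_type cg -pc_type mg ... */
    KSPSolve(ksp,NULL,NULL);                 /* work vectors are created from the DM */
    KSPDestroy(&ksp);
    DMDestroy(&da);
    PetscFinalize();
    return 0;
  }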
Thanks, Filippo Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 Max Max/Min Avg Total Time (sec): 9.095e+01 1.00001 9.095e+01 Objects: 1.875e+03 1.00000 1.875e+03 Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 MPI Reductions: 4.136e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 22.1% 1.809e+03 21.8% 0.000e+00 0.0% 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 77.1% 6.317e+03 76.3% 4.024e+04 97.3% 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 0.1% 1.017e+02 1.2% 5.000e+01 0.1% 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 0.1% 9.587e-01 0.0% 3.330e+02 0.8% 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 0.7% 5.464e+01 0.7% 7.320e+02 1.8% 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: StepStage VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 --- Event Stage 2: ConvStage VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 0.0e+00 2 0 22 22 0 18 0100100 0 0 VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 --- Event Stage 3: ProjStage VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 0.0e+00 1 0 77 76 0 1 0100100 0 0 VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 4.0e+04 84 95 74 73 97 99100 96 96100 13800 PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 --- Event Stage 4: IoStage VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 
6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 --- Event Stage 5: SolvAlloc VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 --- Event Stage 6: SolvSolve VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 0.0e+00 0 0 1 1 0 1 0100100 0 0 VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 --- Event Stage 7: SolvDeall ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Viewer 1 0 0 0 --- Event Stage 1: StepStage --- Event Stage 2: ConvStage --- Event Stage 3: ProjStage Vector 640 640 101604352 0 Matrix 128 128 410327040 0 Index Set 384 384 17062912 0 Krylov Solver 256 256 282624 0 Preconditioner 256 256 228352 0 --- Event Stage 4: IoStage Vector 10 10 2636400 0 Viewer 10 10 6880 0 --- Event Stage 5: SolvAlloc Vector 140 6 8848 0 Vector Scatter 6 0 0 0 Matrix 6 0 0 0 Distributed Mesh 2 0 0 0 Bipartite Graph 4 0 0 0 Index Set 14 14 372400 0 IS L to G Mapping 3 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 --- Event Stage 6: SolvSolve Vector 5 0 0 0 Matrix 1 0 0 0 Index Set 3 0 0 0 Krylov Solver 2 1 1136 0 Preconditioner 2 1 824 0 --- Event Stage 7: SolvDeall Vector 0 133 36676728 0 Vector Scatter 0 1 1036 0 Matrix 0 4 7038924 0 Index Set 0 3 133304 0 Krylov Solver 0 2 2208 0 Preconditioner 0 2 1784 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 1.12057e-05 Average time for zero size MPI_Send(): 1.3113e-06 #PETSc Option Table entries: -ksp_type cg -log_summary -pc_type bjacobi #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Configure options: Application 9457215 resources: utime ~5920s, stime ~58s -------------- next part -------------- A non-text attachment was scrubbed... Name: ETHZ.vcf Type: text/vcard Size: 594 bytes Desc: not available URL: From knepley at gmail.com Mon Sep 29 08:58:35 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 Sep 2014 08:58:35 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <2490546.DNVhllGaLT@besikovitch-ii> References: <2490546.DNVhllGaLT@besikovitch-ii> Message-ID: On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi < filippo.leonardi at sam.math.ethz.ch> wrote: > Hi, > > I am trying to solve a standard second order central differenced Poisson > equation in parallel, in 3D, using a 3D structured DMDAs (extremely > standard > Laplacian matrix). > > I want to get some nice scaling (especially weak), but my results show that > the Krylow method is not performing as expected. The problem (at leas for > CG + > Bjacobi) seems to lie on the number of iterations. > > In particular the number of iterations grows with CG (the matrix is SPD) + > BJacobi as mesh is refined (probably due to condition number increasing) > and > number of processors is increased (probably due to the Bjacobi > preconditioner). For instance I tried the following setup: > 1 procs to solve 32^3 domain => 20 iterations > 8 procs to solve 64^3 domain => 60 iterations > 64 procs to solve 128^3 domain => 101 iterations > > Is there something pathological with my runs (maybe I am missing > something)? > Is there somebody who can provide me weak scaling benchmarks for equivalent > problems? (Maybe there is some better preconditioner for this problem). > Bjacobi is not a scalable preconditioner. As you note, the number of iterates grows with the system size. You should always use MG here. > I am also aware that Multigrid is even better for this problems but the > **scalability** of my runs seems to be as bad as with CG. > MG will weak scale almost perfectly. Send -log_summary for each run if this does not happen. 
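For concreteness, an option set along these lines might look as follows. This is illustrative only: the level count and smoother choices are assumptions, not options taken from this thread, and the geometric hierarchy is built automatically only when the DMDA is attached to the solver (otherwise -pc_mg_levels and restriction operators must be supplied by hand).

  -ksp_type cg -pc_type mg -pc_mg_galerkin -pc_mg_levels 4
  -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor
  -ksp_monitor_true_residual -ksp_view -log_summary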
Thanks, Matt > -pc_mg_galerkin > -pc_type mg > (both directly with richardson or as preconditioner to cg) > > The following is the "-log_summary" of a 128^3 run, notice that I solve the > system multiple times (hence KSPSolve is multiplied by 128). Using CG + > BJacobi. > > Tell me if I missed some detail and sorry for the length of the post. > > Thanks, > Filippo > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > Max Max/Min Avg Total > Time (sec): 9.095e+01 1.00001 9.095e+01 > Objects: 1.875e+03 1.00000 1.875e+03 > Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 > Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 > MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 > MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 > MPI Reductions: 4.136e+04 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> > 2N flops > and VecAXPY() for complex vectors of length N > --> > 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- > -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total > Avg %Total counts %Total > 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 22.1% > 1.809e+03 21.8% 0.000e+00 0.0% > 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 77.1% > 6.317e+03 76.3% 4.024e+04 97.3% > 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 0.1% > 1.017e+02 1.2% 5.000e+01 0.1% > 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 0.1% > 9.587e-01 0.0% 3.330e+02 0.8% > 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 0.7% > 5.464e+01 0.7% 7.320e+02 1.8% > 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting > output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %f - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in > this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all > processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > > --- Event Stage 1: StepStage > > VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 > > --- Event Stage 2: ConvStage > > VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 > VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 > VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 > VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 > 0.0e+00 2 0 22 22 0 18 0100100 0 0 > VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > --- Event Stage 3: ProjStage > > VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 > 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 > VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 > 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 > VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 > VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 > VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 > 0.0e+00 1 0 77 76 0 1 0100100 0 0 > VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 > 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 > MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 > MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 > MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 > MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 > KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 > KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 > 4.0e+04 84 95 74 73 97 99100 96 96100 13800 > PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 > PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 > PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 > > --- Event Stage 4: IoStage > > VecView 10 1.0 
1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 > VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 > > --- Event Stage 5: SolvAlloc > > VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 > MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 > MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > --- Event Stage 6: SolvSolve > > VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 > 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 > VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 > 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 > VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 > VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 > VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 > 0.0e+00 0 0 1 1 0 1 0100100 0 0 > VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 > MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 > 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 > MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 > MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 > MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 > MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 > KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 > 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 > PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 > 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 > PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 > 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 > PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 > > --- Event Stage 7: SolvDeall > > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Viewer 1 0 0 0 > > --- Event Stage 1: StepStage > > > --- Event Stage 2: ConvStage > > > --- Event Stage 3: ProjStage > > Vector 640 640 101604352 0 > Matrix 128 128 410327040 0 > Index Set 384 384 17062912 0 > Krylov Solver 256 256 282624 0 > Preconditioner 256 256 228352 0 > > --- Event Stage 4: IoStage > > Vector 10 10 2636400 0 > Viewer 10 10 6880 0 > > --- Event Stage 5: SolvAlloc > > Vector 140 6 8848 0 > Vector Scatter 6 0 0 0 > Matrix 6 0 0 0 > Distributed Mesh 2 0 0 0 > Bipartite Graph 4 0 0 0 > Index Set 14 14 372400 0 > IS L to G Mapping 3 0 0 0 > Krylov Solver 1 0 0 0 > Preconditioner 1 0 0 0 > > --- Event Stage 6: SolvSolve > > Vector 5 0 0 0 > Matrix 1 0 0 0 > Index Set 3 0 0 0 > Krylov Solver 2 1 1136 0 > Preconditioner 2 1 824 0 > > --- Event Stage 7: SolvDeall > > Vector 0 133 36676728 0 > Vector Scatter 0 1 1036 0 > Matrix 0 4 7038924 0 > Index Set 0 3 133304 0 > Krylov Solver 0 2 2208 0 > Preconditioner 0 2 1784 0 > > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 1.12057e-05 > Average time for zero size MPI_Send(): 1.3113e-06 > #PETSc Option Table entries: > -ksp_type cg > -log_summary > -pc_type bjacobi > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: > Configure options: > Application 9457215 resources: utime ~5920s, stime ~58s > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From filippo.leonardi at sam.math.ethz.ch Mon Sep 29 09:36:48 2014 From: filippo.leonardi at sam.math.ethz.ch (Filippo Leonardi) Date: Mon, 29 Sep 2014 16:36:48 +0200 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: References: <2490546.DNVhllGaLT@besikovitch-ii> Message-ID: <20769360.6VtOyZos7S@besikovitch-ii> Thank you. Actually I had the feeling that it wasn't my problem with Bjacobi and CG. So I'll stick to MG. Problem with MG is that there are a lot of parameters to be tuned, so I leave the defaults (expect I select CG as Krylow method). I post just results for 64^3 and 128^3. Tell me if I'm missing some useful detail. (I get similar results with BoomerAMG). 
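(A general aside on "leaving the defaults": the hierarchy PCMG actually builds -- how many levels, and which smoother with how many iterations on each level -- can be printed by adding the standard PETSc options below to a run. They are generic options, not something used in the logs that follow.)

  -ksp_view
  -ksp_converged_reason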
Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg): 32^3 and 1 proc: 1.01e-1 64^3 and 8 proc: 6.56e-01 128^3 and 64 proc: 1.05e+00 Number of PCSetup per KSPSolve: 15 39 65 With BoomerAMG: stable 8 iterations per KSP but time per iteration greater than PETSc MG and still increases: 64^3: 3.17e+00 128^3: 9.99e+00 --> For instance with 64^3 (256 iterations): Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 Max Max/Min Avg Total Time (sec): 1.896e+02 1.00000 1.896e+02 Objects: 7.220e+03 1.00000 7.220e+03 Flops: 3.127e+10 1.00000 3.127e+10 2.502e+11 Flops/sec: 1.649e+08 1.00000 1.649e+08 1.319e+09 MPI Messages: 9.509e+04 1.00316 9.483e+04 7.586e+05 MPI Message Lengths: 1.735e+09 1.09967 1.685e+04 1.278e+10 MPI Reductions: 4.781e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3416e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: StepStage: 8.7909e-01 0.5% 1.8119e+09 0.7% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: ConvStage: 1.7172e+01 9.1% 9.2610e+09 3.7% 1.843e+05 24.3% 3.981e+03 23.6% 0.000e+00 0.0% 3: ProjStage: 1.6804e+02 88.6% 2.3813e+11 95.2% 5.703e+05 75.2% 1.232e+04 73.1% 4.627e+04 96.8% 4: IoStage: 1.5814e+00 0.8% 0.0000e+00 0.0% 1.420e+03 0.2% 4.993e+02 3.0% 2.500e+02 0.5% 5: SolvAlloc: 2.5722e-01 0.1% 0.0000e+00 0.0% 2.560e+02 0.0% 1.054e+00 0.0% 3.330e+02 0.7% 6: SolvSolve: 1.6776e+00 0.9% 9.5345e+08 0.4% 2.280e+03 0.3% 4.924e+01 0.3% 9.540e+02 2.0% 7: SolvDeall: 7.4017e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: StepStage VecAXPY 3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 99100 0 0 0 2052 --- Event Stage 2: ConvStage VecCopy 4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 VecAXPY 4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 6 26 0 0 0 1978 VecAXPBYCZ 5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 15 61 0 0 0 2179 VecPointwiseMult 4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 8 13 0 0 0 838 VecScatterBegin 7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04 0.0e+00 2 0 24 24 0 20 0100100 0 0 VecScatterEnd 7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 --- Event Stage 3: ProjStage VecMDot 2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00 2.6e+03 1 3 0 0 5 1 3 0 0 6 3364 VecTDot 19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00 2.0e+04 1 4 0 0 42 1 4 0 0 43 3829 VecNorm 13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00 1.3e+04 1 3 0 0 27 1 3 0 0 28 4442 VecScale 13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3764 VecCopy 21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 9 0 0 0 3 9 0 0 0 3624 VecAYPX 30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 5 0 0 0 3 6 0 0 0 2460 VecMAXPY 2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 4 0 0 0 4700 VecScatterBegin 23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04 0.0e+00 1 0 75 73 0 1 0100100 0 0 VecScatterEnd 23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecNormalize 2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00 2.8e+03 0 1 0 0 6 0 1 0 0 6 7504 MatMult 22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04 0.0e+00 25 33 72 70 0 28 35 96 96 0 1780 MatSOR 23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 50 33 0 0 0 56 35 0 0 0 872 KSPGMRESOrthog 2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00 2.6e+03 2 6 0 0 5 2 6 0 0 6 4085 KSPSetUp 768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.6e+03 0 0 0 0 12 0 0 0 0 12 0 KSPSolve 256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04 4.6e+04 88 95 72 70 97 99100 96 96100 1427 PCSetUp 256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+03 0 0 0 0 3 0 0 0 0 3 0 PCApply 10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04 1.3e+04 72 68 40 39 27 81 71 54 54 28 1245 --- Event Stage 4: IoStage VecView 50 1.0 8.8377e-0138.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 0 29 0 0 0 40 0 VecCopy 50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecScatterBegin 30 1.0 1.0644e-02 
1.6 0.00e+00 0.0 7.2e+02 1.6e+04 0.0e+00 0 0 0 0 0 1 0 51 3 0 0 VecScatterEnd 30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 8 0 0 0 0 0 --- Event Stage 5: SolvAlloc VecSet 50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 7 0 0 0 0 0 MatAssemblyBegin 4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 1 0 0 0 2 0 MatAssemblyEnd 4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03 1.6e+01 0 0 0 0 0 6 0 38 49 5 0 --- Event Stage 6: SolvSolve VecMDot 10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 3 0 0 1 3234 VecTDot 80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00 8.0e+01 0 0 0 0 0 1 4 0 0 8 3777 VecNorm 820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00 8.2e+02 0 0 0 0 2 13 3 0 0 86 101 VecScale 52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 3780 VecCopy 91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSet 86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 9 0 0 0 3593 VecAYPX 121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 6 0 0 0 2393 VecMAXPY 11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 4 0 0 0 4682 VecScatterBegin 95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04 0.0e+00 0 0 0 0 0 0 0100100 0 0 VecScatterEnd 95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 0 0 1 0 0 1 7144 MatMult 91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04 0.0e+00 0 0 0 0 0 11 35 96 96 0 1722 MatSOR 93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 23 35 0 0 0 870 KSPGMRESOrthog 10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 1 6 0 0 1 3966 KSPSetUp 3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04 1.9e+02 0 0 0 0 0 40100 96 96 19 1399 PCSetUp 1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 PCApply 41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04 5.1e+01 0 0 0 0 0 33 71 54 54 5 1224 --- Event Stage 7: SolvDeall ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Viewer 1 0 0 0 --- Event Stage 1: StepStage --- Event Stage 2: ConvStage --- Event Stage 3: ProjStage Vector 5376 5376 1417328640 0 Krylov Solver 768 768 8298496 0 Preconditioner 768 768 645120 0 --- Event Stage 4: IoStage Vector 50 50 13182000 0 Viewer 50 50 34400 0 --- Event Stage 5: SolvAlloc Vector 140 6 8848 0 Vector Scatter 6 0 0 0 Matrix 6 0 0 0 Distributed Mesh 2 0 0 0 Bipartite Graph 4 0 0 0 Index Set 14 14 372400 0 IS L to G Mapping 3 0 0 0 Krylov Solver 2 0 0 0 Preconditioner 2 0 0 0 --- Event Stage 6: SolvSolve Vector 22 0 0 0 Krylov Solver 3 2 2296 0 Preconditioner 3 2 1760 0 --- Event Stage 7: SolvDeall Vector 0 149 41419384 0 Vector Scatter 0 1 1036 0 Matrix 0 3 4619676 0 Krylov Solver 0 3 32416 0 Preconditioner 0 3 2520 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 4.62532e-06 Average time for zero size MPI_Send(): 1.51992e-06 #PETSc Option Table entries: -ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Configure options: --> And with 128^3 (512 iterations): Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 Max Max/Min Avg Total Time (sec): 5.889e+02 1.00000 5.889e+02 Objects: 1.413e+04 1.00000 1.413e+04 Flops: 9.486e+10 1.00000 9.486e+10 6.071e+12 Flops/sec: 1.611e+08 1.00000 1.611e+08 1.031e+10 MPI Messages: 5.392e+05 1.00578 5.361e+05 3.431e+07 MPI Message Lengths: 6.042e+09 1.36798 8.286e+03 2.843e+11 MPI Reductions: 1.343e+05 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.1330e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: StepStage: 1.7508e+00 0.3% 2.8991e+10 0.5% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: ConvStage: 3.5534e+01 6.0% 1.4818e+11 2.4% 5.898e+06 17.2% 1.408e+03 17.0% 0.000e+00 0.0% 3: ProjStage: 5.3568e+02 91.0% 5.8820e+12 96.9% 2.833e+07 82.6% 6.765e+03 81.6% 1.319e+05 98.2% 4: IoStage: 1.1365e+01 1.9% 0.0000e+00 0.0% 1.782e+04 0.1% 9.901e+01 1.2% 2.500e+02 0.2% 5: SolvAlloc: 7.1497e-01 0.1% 0.0000e+00 0.0% 5.632e+03 0.0% 1.866e-01 0.0% 3.330e+02 0.2% 6: SolvSolve: 3.7604e+00 0.6% 1.1888e+10 0.2% 5.722e+04 0.2% 1.366e+01 0.2% 1.803e+03 1.3% 7: SolvDeall: 7.6677e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: StepStage VecAXPY 6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 99100 0 0 0 15941 --- Event Stage 2: ConvStage VecCopy 9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 VecAXPY 9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 6 26 0 0 0 16076 VecAXPBYCZ 10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 14 61 0 0 0 17460 VecPointwiseMult 9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 8 13 0 0 0 6662 VecScatterBegin 15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03 0.0e+00 1 0 17 17 0 18 0100100 0 0 VecScatterEnd 15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 --- Event Stage 3: ProjStage VecMDot 5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00 5.1e+03 1 2 0 0 4 1 2 0 0 4 22644 VecTDot 66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00 6.6e+04 2 5 0 0 49 2 5 0 0 50 20295 VecNorm 39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00 3.9e+04 2 3 0 0 29 2 3 0 0 30 11392 VecScale 39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 29356 VecCopy 70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 9 0 0 0 3 10 0 0 0 29422 VecAYPX 99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 19464 VecMAXPY 5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 37169 VecScatterBegin 73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03 0.0e+00 1 0 83 82 0 1 0100100 0 0 VecScatterEnd 73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 VecNormalize 5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00 5.6e+03 0 1 0 0 4 0 1 0 0 4 38996 MatMult 71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03 0.0e+00 26 35 80 79 0 28 36 97 97 0 13462 MatSOR 72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00 0.0e+00 49 34 0 0 0 54 35 0 0 0 6953 KSPGMRESOrthog 5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00 5.1e+03 1 4 0 0 4 1 4 0 0 4 29218 KSPSetUp 1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 0 0 0 0 8 0 0 0 0 9 0 KSPSolve 512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03 1.3e+05 90 97 80 79 98 99100 97 97100 11034 PCSetUp 512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.1e+03 0 0 0 0 2 0 0 0 0 2 0 PCApply 33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03 2.6e+04 71 67 43 43 19 78 69 52 52 20 9585 --- Event Stage 4: IoStage VecView 50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 1 0 0 0 0 34 0 0 0 40 0 VecCopy 50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 
VecScatterBegin 30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 VecScatterEnd 30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 10 0 0 0 0 0 --- Event Stage 5: SolvAlloc VecSet 50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 14 0 0 0 0 0 MatAssemblyBegin 4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 2 0 0 0 2 0 MatAssemblyEnd 4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 --- Event Stage 6: SolvSolve VecMDot 10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 2 0 0 1 19117 VecTDot 134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 0 1 5 0 0 7 21497 VecNorm 1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00 1.6e+03 0 0 0 0 1 29 3 0 0 90 223 VecScale 79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 27741 VecCopy 145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSet 140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 10 0 0 0 28619 VecAYPX 202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 6 0 0 0 19893 VecMAXPY 11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 35242 VecScatterBegin 149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03 0.0e+00 0 0 0 0 0 0 0100100 0 0 VecScatterEnd 149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 0 0 1 0 0 1 9736 MatMult 145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03 0.0e+00 0 0 0 0 0 8 36 97 97 0 13055 MatSOR 147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 16 35 0 0 0 6923 KSPGMRESOrthog 10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 4 0 0 1 25694 KSPSetUp 3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 0 1 0 0 0 1 0 KSPSolve 1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03 2.7e+02 0 0 0 0 0 30100 97 97 15 10378 PCSetUp 1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03 5.1e+01 0 0 0 0 0 24 69 52 52 3 8959 --- Event Stage 7: SolvDeall ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Viewer 1 0 0 0 --- Event Stage 1: StepStage --- Event Stage 2: ConvStage --- Event Stage 3: ProjStage Vector 10752 10752 2834657280 0 Krylov Solver 1536 1536 16596992 0 Preconditioner 1536 1536 1290240 0 --- Event Stage 4: IoStage Vector 50 50 13182000 0 Viewer 50 50 34400 0 --- Event Stage 5: SolvAlloc Vector 140 6 8848 0 Vector Scatter 6 0 0 0 Matrix 6 0 0 0 Distributed Mesh 2 0 0 0 Bipartite Graph 4 0 0 0 Index Set 14 14 372400 0 IS L to G Mapping 3 0 0 0 Krylov Solver 2 0 0 0 Preconditioner 2 0 0 0 --- Event Stage 6: SolvSolve Vector 22 0 0 0 Krylov Solver 3 2 2296 0 Preconditioner 3 2 1760 0 --- Event Stage 7: SolvDeall Vector 0 149 41419384 0 Vector Scatter 0 1 1036 0 Matrix 0 3 4619676 0 Krylov Solver 0 3 32416 0 Preconditioner 0 3 2520 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 1.13964e-05 Average time for zero size MPI_Send(): 1.2815e-06 #PETSc Option Table entries: -ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Configure options: Best, Filippo On Monday 29 September 2014 08:58:35 Matthew Knepley wrote: > On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi < > > filippo.leonardi at sam.math.ethz.ch> wrote: > > Hi, > > > > I am trying to solve a standard second order central differenced Poisson > > equation in parallel, in 3D, using a 3D structured DMDAs (extremely > > standard > > Laplacian matrix). > > > > I want to get some nice scaling (especially weak), but my results show > > that > > the Krylow method is not performing as expected. The problem (at leas for > > CG + > > Bjacobi) seems to lie on the number of iterations. > > > > In particular the number of iterations grows with CG (the matrix is SPD) > > + > > BJacobi as mesh is refined (probably due to condition number increasing) > > and > > number of processors is increased (probably due to the Bjacobi > > preconditioner). For instance I tried the following setup: > > 1 procs to solve 32^3 domain => 20 iterations > > 8 procs to solve 64^3 domain => 60 iterations > > 64 procs to solve 128^3 domain => 101 iterations > > > > Is there something pathological with my runs (maybe I am missing > > something)? > > Is there somebody who can provide me weak scaling benchmarks for > > equivalent > > problems? (Maybe there is some better preconditioner for this problem). > > Bjacobi is not a scalable preconditioner. As you note, the number of > iterates grows > with the system size. You should always use MG here. > > > I am also aware that Multigrid is even better for this problems but the > > **scalability** of my runs seems to be as bad as with CG. > > MG will weak scale almost perfectly. Send -log_summary for each run if this > does not happen. > > Thanks, > > Matt > > > -pc_mg_galerkin > > -pc_type mg > > (both directly with richardson or as preconditioner to cg) > > > > The following is the "-log_summary" of a 128^3 run, notice that I solve > > the > > system multiple times (hence KSPSolve is multiplied by 128). Using CG + > > BJacobi. > > > > Tell me if I missed some detail and sorry for the length of the post. 
> > > > Thanks, > > Filippo > > > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > > > Max Max/Min Avg Total > > > > Time (sec): 9.095e+01 1.00001 9.095e+01 > > Objects: 1.875e+03 1.00000 1.875e+03 > > Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 > > Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 > > MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 > > MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 > > MPI Reductions: 4.136e+04 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > > (multiply/divide/add/subtract) > > > > e.g., VecAXPY() for real vectors of length N > > > > --> > > 2N flops > > > > and VecAXPY() for complex vectors of length N > > > > --> > > 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > > --- > > -- Message Lengths -- -- Reductions -- > > > > Avg %Total Avg %Total counts > > %Total > > > > Avg %Total counts %Total > > > > 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 > > 22.1% > > > > 1.809e+03 21.8% 0.000e+00 0.0% > > > > 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 > > 77.1% > > > > 6.317e+03 76.3% 4.024e+04 97.3% > > > > 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 > > 0.1% > > > > 1.017e+02 1.2% 5.000e+01 0.1% > > > > 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 > > 0.1% > > > > 9.587e-01 0.0% 3.330e+02 0.8% > > > > 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 > > 0.7% > > > > 5.464e+01 0.7% 7.320e+02 1.8% > > > > 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- See the 'Profiling' chapter > > of the users' manual for details on > > interpreting > > output. > > > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > > > Ratio - ratio of maximum to minimum over all processors > > > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > > > > PetscLogStagePop(). 
> > > > %T - percent time in this phase %f - percent flops in this > > > > phase > > > > %M - percent messages in this phase %L - percent message lengths > > > > in > > this phase > > > > %R - percent reductions in this phase > > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > > > over all > > processors) > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- Event Count > > Time (sec) Flops > > --- Global --- --- Stage --- Total > > > > Max Ratio Max Ratio Max Ratio Mess Avg len > > > > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- > > > > --- Event Stage 0: Main Stage > > > > > > --- Event Stage 1: StepStage > > > > VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 > > > > --- Event Stage 2: ConvStage > > > > VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > > VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 > > VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 > > VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 > > VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 > > 0.0e+00 2 0 22 22 0 18 0100100 0 0 > > VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > > > --- Event Stage 3: ProjStage > > > > VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 > > 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 > > VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 > > 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 > > VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > > VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 > > VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 > > VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 > > 0.0e+00 1 0 77 76 0 1 0100100 0 0 > > VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > > MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 > > 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 > > MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 > > MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 > > MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 > > MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 > > KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > > 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 > > KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 > > 4.0e+04 84 95 74 73 97 99100 96 96100 13800 > > PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > > 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 > > PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 
0.0e+00 > > 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 > > PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 > > > > --- Event Stage 4: IoStage > > > > VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 > > VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 > > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > > VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 > > > > --- Event Stage 5: SolvAlloc > > > > VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 > > MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 > > MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 > > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > > > --- Event Stage 6: SolvSolve > > > > VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 > > 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 > > VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 > > 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 > > VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 > > VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 > > VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 > > 0.0e+00 0 0 1 1 0 1 0100100 0 0 > > VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 > > 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 > > MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 > > MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 > > MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 > > KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 > > 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 > > PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 > > 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 > > PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 > > 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 > > PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 > > > > --- Event Stage 7: SolvDeall > > > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > > Mem. > > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage
> > Viewer 1 0 0 0
> > --- Event Stage 1: StepStage
> > --- Event Stage 2: ConvStage
> > --- Event Stage 3: ProjStage
> > Vector 640 640 101604352 0
> > Matrix 128 128 410327040 0
> > Index Set 384 384 17062912 0
> > Krylov Solver 256 256 282624 0
> > Preconditioner 256 256 228352 0
> > --- Event Stage 4: IoStage
> > Vector 10 10 2636400 0
> > Viewer 10 10 6880 0
> > --- Event Stage 5: SolvAlloc
> > Vector 140 6 8848 0
> > Vector Scatter 6 0 0 0
> > Matrix 6 0 0 0
> > Distributed Mesh 2 0 0 0
> > Bipartite Graph 4 0 0 0
> > Index Set 14 14 372400 0
> > IS L to G Mapping 3 0 0 0
> > Krylov Solver 1 0 0 0
> > Preconditioner 1 0 0 0
> > --- Event Stage 6: SolvSolve
> > Vector 5 0 0 0
> > Matrix 1 0 0 0
> > Index Set 3 0 0 0
> > Krylov Solver 2 1 1136 0
> > Preconditioner 2 1 824 0
> > --- Event Stage 7: SolvDeall
> > Vector 0 133 36676728 0
> > Vector Scatter 0 1 1036 0
> > Matrix 0 4 7038924 0
> > Index Set 0 3 133304 0
> > Krylov Solver 0 2 2208 0
> > Preconditioner 0 2 1784 0
> > ========================================================================================================================
> > Average time to get PetscTime(): 9.53674e-08
> > Average time for MPI_Barrier(): 1.12057e-05
> > Average time for zero size MPI_Send(): 1.3113e-06
> > #PETSc Option Table entries:
> > -ksp_type cg
> > -log_summary
> > -pc_type bjacobi
> > #End of PETSc Option Table entries
> > Compiled without FORTRAN kernels
> > Compiled with full precision matrices (default)
> > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> > Configure run at:
> > Configure options:
> > Application 9457215 resources: utime ~5920s, stime ~58s

From hzhang at mcs.anl.gov Mon Sep 29 10:17:12 2014
From: hzhang at mcs.anl.gov (Hong)
Date: Mon, 29 Sep 2014 10:17:12 -0500
Subject: [petsc-users] KSP_DIVERGED_INDEFINITE_PC message even with PCFactorSetShiftType
In-Reply-To:
References:
Message-ID:

Evan:

> PCFactorSetMatSolverPackage(pc_fetd_dt, MATSOLVERMUMPS);

You are using MUMPS' Cholesky.

> PCFactorSetShiftType(pc_fetd_dt,MAT_SHIFT_POSITIVE_DEFINITE);

This option only works for PETSc's own Cholesky, not for MUMPS. I am not aware of any option in MUMPS to add a shift to a zero pivot; I will send a request to the MUMPS developers about it.

Hong

From knepley at gmail.com Mon Sep 29 14:39:56 2014
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 29 Sep 2014 14:39:56 -0500
Subject: [petsc-users] Scaling/Preconditioners for Poisson equation
In-Reply-To: <20769360.6VtOyZos7S@besikovitch-ii>
References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii>
Message-ID:

On Mon, Sep 29, 2014 at 9:36 AM, Filippo Leonardi <filippo.leonardi at sam.math.ethz.ch> wrote:
> Thank you.
>
> Actually I had the feeling that it wasn't my problem with Bjacobi and CG.
>
> So I'll stick to MG. The problem with MG is that there are a lot of parameters to
> be tuned, so I leave the defaults (except that I select CG as the Krylov method). I
> post just the results for 64^3 and 128^3. Tell me if I'm missing some useful
> detail. (I get similar results with BoomerAMG.)

1) I assume we are looking at ProjStage?
2) Why are you doing a different number of solves on the different number of processes? Matt > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin > -pc_type > mg): > 32^3 and 1 proc: 1.01e-1 > 64^3 and 8 proc: 6.56e-01 > 128^3 and 64 proc: 1.05e+00 > Number of PCSetup per KSPSolve: > 15 > 39 > 65 > > With BoomerAMG: > stable 8 iterations per KSP but time per iteration greater than PETSc MG > and > still increases: > 64^3: 3.17e+00 > 128^3: 9.99e+00 > > > --> For instance with 64^3 (256 iterations): > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > Max Max/Min Avg Total > Time (sec): 1.896e+02 1.00000 1.896e+02 > Objects: 7.220e+03 1.00000 7.220e+03 > Flops: 3.127e+10 1.00000 3.127e+10 2.502e+11 > Flops/sec: 1.649e+08 1.00000 1.649e+08 1.319e+09 > MPI Messages: 9.509e+04 1.00316 9.483e+04 7.586e+05 > MPI Message Lengths: 1.735e+09 1.09967 1.685e+04 1.278e+10 > MPI Reductions: 4.781e+04 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> > 2N flops > and VecAXPY() for complex vectors of length N > --> > 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- > -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total > Avg %Total counts %Total > 0: Main Stage: 1.3416e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 1: StepStage: 8.7909e-01 0.5% 1.8119e+09 0.7% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 2: ConvStage: 1.7172e+01 9.1% 9.2610e+09 3.7% 1.843e+05 24.3% > 3.981e+03 23.6% 0.000e+00 0.0% > 3: ProjStage: 1.6804e+02 88.6% 2.3813e+11 95.2% 5.703e+05 75.2% > 1.232e+04 73.1% 4.627e+04 96.8% > 4: IoStage: 1.5814e+00 0.8% 0.0000e+00 0.0% 1.420e+03 0.2% > 4.993e+02 3.0% 2.500e+02 0.5% > 5: SolvAlloc: 2.5722e-01 0.1% 0.0000e+00 0.0% 2.560e+02 0.0% > 1.054e+00 0.0% 3.330e+02 0.7% > 6: SolvSolve: 1.6776e+00 0.9% 9.5345e+08 0.4% 2.280e+03 0.3% > 4.924e+01 0.3% 9.540e+02 2.0% > 7: SolvDeall: 7.4017e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting > output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %f - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in > this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all > processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > > --- Event Stage 1: StepStage > > VecAXPY 3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 99100 0 0 0 2052 > > --- Event Stage 2: ConvStage > > VecCopy 4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > VecAXPY 4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 6 26 0 0 0 1978 > VecAXPBYCZ 5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 15 61 0 0 0 2179 > VecPointwiseMult 4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 8 13 0 0 0 838 > VecScatterBegin 7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04 > 0.0e+00 2 0 24 24 0 20 0100100 0 0 > VecScatterEnd 7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 > > --- Event Stage 3: ProjStage > > VecMDot 2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00 > 2.6e+03 1 3 0 0 5 1 3 0 0 6 3364 > VecTDot 19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00 > 2.0e+04 1 4 0 0 42 1 4 0 0 43 3829 > VecNorm 13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00 > 1.3e+04 1 3 0 0 27 1 3 0 0 28 4442 > VecScale 13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 3764 > VecCopy 21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > VecSet 21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 9 0 0 0 3 9 0 0 0 3624 > VecAYPX 30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 5 0 0 0 3 6 0 0 0 2460 > VecMAXPY 2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 3 0 0 0 1 4 0 0 0 4700 > VecScatterBegin 23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04 > 0.0e+00 1 0 75 73 0 1 0100100 0 0 > VecScatterEnd 23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecNormalize 2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00 > 2.8e+03 0 1 0 0 6 0 1 0 0 6 7504 > MatMult 22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04 > 0.0e+00 25 33 72 70 0 28 35 96 96 0 1780 > MatSOR 23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 50 33 0 0 0 56 35 0 0 0 872 > KSPGMRESOrthog 2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00 > 2.6e+03 2 6 0 0 5 2 6 0 0 6 4085 > KSPSetUp 768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 5.6e+03 0 0 0 0 12 0 0 0 0 12 0 > KSPSolve 256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04 > 4.6e+04 88 95 72 70 97 99100 96 96100 1427 > PCSetUp 256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+03 0 0 0 0 3 0 0 0 0 3 0 > PCApply 10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04 > 1.3e+04 72 68 40 39 27 81 71 54 54 28 1245 > > --- Event Stage 4: IoStage > > VecView 50 1.0 8.8377e-0138.4 0.00e+00 0.0 
0.0e+00 0.0e+00 > 1.0e+02 0 0 0 0 0 29 0 0 0 40 0 > VecCopy 50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecScatterBegin 30 1.0 1.0644e-02 1.6 0.00e+00 0.0 7.2e+02 1.6e+04 > 0.0e+00 0 0 0 0 0 1 0 51 3 0 0 > VecScatterEnd 30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 8 0 0 0 0 0 > > --- Event Stage 5: SolvAlloc > > VecSet 50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 7 0 0 0 0 0 > MatAssemblyBegin 4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 1 0 0 0 2 0 > MatAssemblyEnd 4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03 > 1.6e+01 0 0 0 0 0 6 0 38 49 5 0 > > --- Event Stage 6: SolvSolve > > VecMDot 10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 3 0 0 1 3234 > VecTDot 80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00 > 8.0e+01 0 0 0 0 0 1 4 0 0 8 3777 > VecNorm 820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00 > 8.2e+02 0 0 0 0 2 13 3 0 0 86 101 > VecScale 52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 1 0 0 0 3780 > VecCopy 91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecSet 86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 9 0 0 0 3593 > VecAYPX 121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 6 0 0 0 2393 > VecMAXPY 11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 4 0 0 0 4682 > VecScatterBegin 95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04 > 0.0e+00 0 0 0 0 0 0 0100100 0 0 > VecScatterEnd 95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecNormalize 11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00 > 1.1e+01 0 0 0 0 0 0 1 0 0 1 7144 > MatMult 91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04 > 0.0e+00 0 0 0 0 0 11 35 96 96 0 1722 > MatSOR 93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 23 35 0 0 0 870 > KSPGMRESOrthog 10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 1 6 0 0 1 3966 > KSPSetUp 3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 0 0 0 0 0 3 0 > KSPSolve 1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04 > 1.9e+02 0 0 0 0 0 40100 96 96 19 1399 > PCSetUp 1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 > PCApply 41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04 > 5.1e+01 0 0 0 0 0 33 71 54 54 5 1224 > > --- Event Stage 7: SolvDeall > > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Viewer 1 0 0 0 > > --- Event Stage 1: StepStage > > > --- Event Stage 2: ConvStage > > > --- Event Stage 3: ProjStage > > Vector 5376 5376 1417328640 0 > Krylov Solver 768 768 8298496 0 > Preconditioner 768 768 645120 0 > > --- Event Stage 4: IoStage > > Vector 50 50 13182000 0 > Viewer 50 50 34400 0 > > --- Event Stage 5: SolvAlloc > > Vector 140 6 8848 0 > Vector Scatter 6 0 0 0 > Matrix 6 0 0 0 > Distributed Mesh 2 0 0 0 > Bipartite Graph 4 0 0 0 > Index Set 14 14 372400 0 > IS L to G Mapping 3 0 0 0 > Krylov Solver 2 0 0 0 > Preconditioner 2 0 0 0 > > --- Event Stage 6: SolvSolve > > Vector 22 0 0 0 > Krylov Solver 3 2 2296 0 > Preconditioner 3 2 1760 0 > > --- Event Stage 7: SolvDeall > > Vector 0 149 41419384 0 > Vector Scatter 0 1 1036 0 > Matrix 0 3 4619676 0 > Krylov Solver 0 3 32416 0 > Preconditioner 0 3 2520 0 > > ======================================================================================================================== > Average time to get PetscTime(): 1.90735e-07 > Average time for MPI_Barrier(): 4.62532e-06 > Average time for zero size MPI_Send(): 1.51992e-06 > #PETSc Option Table entries: > -ksp_type cg > -log_summary > -pc_mg_galerkin > -pc_type mg > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: > Configure options: > > --> And with 128^3 (512 iterations): > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > Max Max/Min Avg Total > Time (sec): 5.889e+02 1.00000 5.889e+02 > Objects: 1.413e+04 1.00000 1.413e+04 > Flops: 9.486e+10 1.00000 9.486e+10 6.071e+12 > Flops/sec: 1.611e+08 1.00000 1.611e+08 1.031e+10 > MPI Messages: 5.392e+05 1.00578 5.361e+05 3.431e+07 > MPI Message Lengths: 6.042e+09 1.36798 8.286e+03 2.843e+11 > MPI Reductions: 1.343e+05 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> > 2N flops > and VecAXPY() for complex vectors of length N > --> > 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- > -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total > Avg %Total counts %Total > 0: Main Stage: 1.1330e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 1: StepStage: 1.7508e+00 0.3% 2.8991e+10 0.5% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 2: ConvStage: 3.5534e+01 6.0% 1.4818e+11 2.4% 5.898e+06 17.2% > 1.408e+03 17.0% 0.000e+00 0.0% > 3: ProjStage: 5.3568e+02 91.0% 5.8820e+12 96.9% 2.833e+07 82.6% > 6.765e+03 81.6% 1.319e+05 98.2% > 4: IoStage: 1.1365e+01 1.9% 0.0000e+00 0.0% 1.782e+04 0.1% > 9.901e+01 1.2% 2.500e+02 0.2% > 5: SolvAlloc: 7.1497e-01 0.1% 0.0000e+00 0.0% 5.632e+03 0.0% > 1.866e-01 0.0% 3.330e+02 0.2% > 6: SolvSolve: 3.7604e+00 0.6% 1.1888e+10 0.2% 5.722e+04 0.2% > 1.366e+01 0.2% 1.803e+03 1.3% > 7: SolvDeall: 7.6677e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting > output. 
> Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). > %T - percent time in this phase %f - percent flops in this > phase > %M - percent messages in this phase %L - percent message lengths > in > this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all > processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > > --- Event Stage 1: StepStage > > VecAXPY 6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 99100 0 0 0 15941 > > --- Event Stage 2: ConvStage > > VecCopy 9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > VecAXPY 9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 6 26 0 0 0 16076 > VecAXPBYCZ 10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 14 61 0 0 0 17460 > VecPointwiseMult 9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 8 13 0 0 0 6662 > VecScatterBegin 15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03 > 0.0e+00 1 0 17 17 0 18 0100100 0 0 > VecScatterEnd 15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > --- Event Stage 3: ProjStage > > VecMDot 5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00 > 5.1e+03 1 2 0 0 4 1 2 0 0 4 22644 > VecTDot 66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00 > 6.6e+04 2 5 0 0 49 2 5 0 0 50 20295 > VecNorm 39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00 > 3.9e+04 2 3 0 0 29 2 3 0 0 30 11392 > VecScale 39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 29356 > VecCopy 70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > VecSet 69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 9 0 0 0 3 10 0 0 0 29422 > VecAYPX 99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 6 0 0 0 3 6 0 0 0 19464 > VecMAXPY 5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 1 2 0 0 0 37169 > VecScatterBegin 73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03 > 0.0e+00 1 0 83 82 0 1 0100100 0 0 > VecScatterEnd 73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > VecNormalize 5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00 > 5.6e+03 0 1 0 0 4 0 1 0 0 4 38996 > MatMult 71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03 > 0.0e+00 26 35 80 79 0 28 36 97 97 0 13462 > MatSOR 72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 49 34 0 0 0 54 35 0 0 0 6953 > KSPGMRESOrthog 5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00 > 5.1e+03 1 4 0 0 4 1 4 0 0 4 29218 > KSPSetUp 1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 
0.0e+00 > 1.1e+04 0 0 0 0 8 0 0 0 0 9 0 > KSPSolve 512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03 > 1.3e+05 90 97 80 79 98 99100 97 97100 11034 > PCSetUp 512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 3.1e+03 0 0 0 0 2 0 0 0 0 2 0 > PCApply 33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03 > 2.6e+04 71 67 43 43 19 78 69 52 52 20 9585 > > --- Event Stage 4: IoStage > > VecView 50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+02 1 0 0 0 0 34 0 0 0 40 0 > VecCopy 50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > VecScatterEnd 30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 10 0 0 0 0 0 > > --- Event Stage 5: SolvAlloc > > VecSet 50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 14 0 0 0 0 0 > MatAssemblyBegin 4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 2 0 0 0 2 0 > MatAssemblyEnd 4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > --- Event Stage 6: SolvSolve > > VecMDot 10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 2 0 0 1 19117 > VecTDot 134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00 > 1.3e+02 0 0 0 0 0 1 5 0 0 7 21497 > VecNorm 1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00 > 1.6e+03 0 0 0 0 1 29 3 0 0 90 223 > VecScale 79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 1 0 0 0 27741 > VecCopy 145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecSet 140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 10 0 0 0 28619 > VecAYPX 202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 6 0 0 0 19893 > VecMAXPY 11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 2 0 0 0 35242 > VecScatterBegin 149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 0 0100100 0 0 > VecScatterEnd 149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecNormalize 11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00 > 1.1e+01 0 0 0 0 0 0 1 0 0 1 9736 > MatMult 145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 8 36 97 97 0 13055 > MatSOR 147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 16 35 0 0 0 6923 > KSPGMRESOrthog 10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 4 0 0 1 25694 > KSPSetUp 3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 0 1 0 0 0 1 0 > KSPSolve 1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03 > 2.7e+02 0 0 0 0 0 30100 97 97 15 10378 > PCSetUp 1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PCApply 68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03 > 5.1e+01 0 0 0 0 0 24 69 52 52 3 8959 > > --- Event Stage 7: SolvDeall > > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Viewer 1 0 0 0 > > --- Event Stage 1: StepStage > > > --- Event Stage 2: ConvStage > > > --- Event Stage 3: ProjStage > > Vector 10752 10752 2834657280 0 > Krylov Solver 1536 1536 16596992 0 > Preconditioner 1536 1536 1290240 0 > > --- Event Stage 4: IoStage > > Vector 50 50 13182000 0 > Viewer 50 50 34400 0 > > --- Event Stage 5: SolvAlloc > > Vector 140 6 8848 0 > Vector Scatter 6 0 0 0 > Matrix 6 0 0 0 > Distributed Mesh 2 0 0 0 > Bipartite Graph 4 0 0 0 > Index Set 14 14 372400 0 > IS L to G Mapping 3 0 0 0 > Krylov Solver 2 0 0 0 > Preconditioner 2 0 0 0 > > --- Event Stage 6: SolvSolve > > Vector 22 0 0 0 > Krylov Solver 3 2 2296 0 > Preconditioner 3 2 1760 0 > > --- Event Stage 7: SolvDeall > > Vector 0 149 41419384 0 > Vector Scatter 0 1 1036 0 > Matrix 0 3 4619676 0 > Krylov Solver 0 3 32416 0 > Preconditioner 0 3 2520 0 > > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 1.13964e-05 > Average time for zero size MPI_Send(): 1.2815e-06 > #PETSc Option Table entries: > -ksp_type cg > -log_summary > -pc_mg_galerkin > -pc_type mg > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: > Configure options: > > Best, > Filippo > > On Monday 29 September 2014 08:58:35 Matthew Knepley wrote: > > On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi < > > > > filippo.leonardi at sam.math.ethz.ch> wrote: > > > Hi, > > > > > > I am trying to solve a standard second order central differenced > Poisson > > > equation in parallel, in 3D, using a 3D structured DMDAs (extremely > > > standard > > > Laplacian matrix). > > > > > > I want to get some nice scaling (especially weak), but my results show > > > that > > > the Krylow method is not performing as expected. The problem (at leas > for > > > CG + > > > Bjacobi) seems to lie on the number of iterations. > > > > > > In particular the number of iterations grows with CG (the matrix is > SPD) > > > + > > > BJacobi as mesh is refined (probably due to condition number > increasing) > > > and > > > number of processors is increased (probably due to the Bjacobi > > > preconditioner). For instance I tried the following setup: > > > 1 procs to solve 32^3 domain => 20 iterations > > > 8 procs to solve 64^3 domain => 60 iterations > > > 64 procs to solve 128^3 domain => 101 iterations > > > > > > Is there something pathological with my runs (maybe I am missing > > > something)? > > > Is there somebody who can provide me weak scaling benchmarks for > > > equivalent > > > problems? (Maybe there is some better preconditioner for this problem). > > > > Bjacobi is not a scalable preconditioner. As you note, the number of > > iterates grows > > with the system size. You should always use MG here. > > > > > I am also aware that Multigrid is even better for this problems but the > > > **scalability** of my runs seems to be as bad as with CG. > > > > MG will weak scale almost perfectly. Send -log_summary for each run if > this > > does not happen. 
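To make the advice above concrete, here is a minimal sketch of how the options discussed in this thread (-ksp_type cg -pc_type mg -pc_mg_galerkin) could be selected from code rather than the command line. This is not the poster's code: the KSP object `ksp` is assumed to already exist and to be attached to the DMDA that defines the Poisson discretization (via KSPSetDM plus the usual compute-operator callback), and the calls reflect the PETSc 3.3/3.4-era C API used elsewhere in this thread.

```c
PC             pc;
PetscErrorCode ierr;

/* CG as the Krylov method (the Laplacian here is SPD) */
ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);

/* Geometric multigrid preconditioner built on the DMDA hierarchy */
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCMG);CHKERRQ(ierr);

/* Form coarse-level operators algebraically as R*A*P,
   the same effect as passing -pc_mg_galerkin */
ierr = PCMGSetGalerkin(pc, PETSC_TRUE);CHKERRQ(ierr);

/* Leave level/smoother tuning to the command line, e.g.
   -pc_mg_levels 4 -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor */
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
```

Running the 32^3, 64^3, and 128^3 cases again with -log_summary and comparing the KSPSolve line in ProjStage is then the check Matt asks for: with a working MG hierarchy the iteration count should stay roughly flat as the mesh is refined.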
> > > > Thanks, > > > > Matt > > > > > -pc_mg_galerkin > > > -pc_type mg > > > (both directly with richardson or as preconditioner to cg) > > > > > > The following is the "-log_summary" of a 128^3 run, notice that I solve > > > the > > > system multiple times (hence KSPSolve is multiplied by 128). Using CG + > > > BJacobi. > > > > > > Tell me if I missed some detail and sorry for the length of the post. > > > > > > Thanks, > > > Filippo > > > > > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT > 2012 > > > > > > Max Max/Min Avg Total > > > > > > Time (sec): 9.095e+01 1.00001 9.095e+01 > > > Objects: 1.875e+03 1.00000 1.875e+03 > > > Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 > > > Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 > > > MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 > > > MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 > > > MPI Reductions: 4.136e+04 1.00000 > > > > > > Flop counting convention: 1 flop = 1 real number operation of type > > > (multiply/divide/add/subtract) > > > > > > e.g., VecAXPY() for real vectors of length > N > > > > > > --> > > > 2N flops > > > > > > and VecAXPY() for complex vectors of > length N > > > > > > --> > > > 8N flops > > > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > > > --- > > > -- Message Lengths -- -- Reductions -- > > > > > > Avg %Total Avg %Total counts > > > %Total > > > > > > Avg %Total counts %Total > > > > > > 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 > > > 0.0% > > > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 > > > 0.0% > > > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 > > > 22.1% > > > > > > 1.809e+03 21.8% 0.000e+00 0.0% > > > > > > 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 > > > 77.1% > > > > > > 6.317e+03 76.3% 4.024e+04 97.3% > > > > > > 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 > > > 0.1% > > > > > > 1.017e+02 1.2% 5.000e+01 0.1% > > > > > > 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 > > > 0.1% > > > > > > 9.587e-01 0.0% 3.330e+02 0.8% > > > > > > 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 > > > 0.7% > > > > > > 5.464e+01 0.7% 7.320e+02 1.8% > > > > > > 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 > > > 0.0% > > > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > > > > > -------------------------------------------------------------------------- > > > ---------------------------------------------- See the 'Profiling' > chapter > > > of the users' manual for details on > > > interpreting > > > output. > > > > > > Phase summary info: > > > Count: number of times phase was executed > > > Time and Flops: Max - maximum over all processors > > > > > > Ratio - ratio of maximum to minimum over all > processors > > > > > > Mess: number of messages sent > > > Avg. len: average message length > > > Reduct: number of global reductions > > > Global: entire computation > > > Stage: stages of a computation. Set stages with PetscLogStagePush() > and > > > > > > PetscLogStagePop(). 
> > > > > > %T - percent time in this phase %f - percent flops in > this > > > > > > phase > > > > > > %M - percent messages in this phase %L - percent message > lengths > > > > > > in > > > this phase > > > > > > %R - percent reductions in this phase > > > > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > > > > > over all > > > processors) > > > > > > > -------------------------------------------------------------------------- > > > ---------------------------------------------- Event > Count > > > Time (sec) Flops > > > --- Global --- --- Stage --- Total > > > > > > Max Ratio Max Ratio Max Ratio Mess Avg > len > > > > > > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > > > > > > -------------------------------------------------------------------------- > > > ---------------------------------------------- > > > > > > --- Event Stage 0: Main Stage > > > > > > > > > --- Event Stage 1: StepStage > > > > > > VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 > > > > > > --- Event Stage 2: ConvStage > > > > > > VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > > > VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 > > > VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 > > > VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 > > > VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 > 8.2e+03 > > > 0.0e+00 2 0 22 22 0 18 0100100 0 0 > > > VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > > > > > --- Event Stage 3: ProjStage > > > > > > VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 > 0.0e+00 > > > 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 > > > VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 > 0.0e+00 > > > 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 > > > VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > > > VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 > > > VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 > > > VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 > 8.2e+03 > > > 0.0e+00 1 0 77 76 0 1 0100100 0 0 > > > VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > > > MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 > 8.2e+03 > > > 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 > > > MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 > > > MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 > > > MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 > > > MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 > > > KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 > > > KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 > 
8.2e+03 > > > 4.0e+04 84 95 74 73 97 99100 96 96100 13800 > > > PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 > 0.0e+00 > > > 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 > > > PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 > 0.0e+00 > > > 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 > > > PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 > > > > > > --- Event Stage 4: IoStage > > > > > > VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 > > > VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 > 8.2e+03 > > > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > > > VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 > > > > > > --- Event Stage 5: SolvAlloc > > > > > > VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 > > > MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 > > > MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 > 2.0e+03 > > > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > > > > > --- Event Stage 6: SolvSolve > > > > > > VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 > 0.0e+00 > > > 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 > > > VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 > 0.0e+00 > > > 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 > > > VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 > > > VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 > > > VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 > 8.2e+03 > > > 0.0e+00 0 0 1 1 0 1 0100100 0 0 > > > VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > > MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 > 8.2e+03 > > > 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 > > > MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 > > > MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 > > > MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > > MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 > 0.0e+00 > > > 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 > > > KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 > 8.2e+03 > > > 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 > > > PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 > 0.0e+00 > > > 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 > > > PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 > 0.0e+00 > > > 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 > > > PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 > 0.0e+00 > > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 > > > > > > --- Event Stage 7: SolvDeall > > > > > > > > > > -------------------------------------------------------------------------- > > > ---------------------------------------------- > > > > > > 
> > > Memory usage is given in bytes:
> > > Object Type Creations Destructions Memory Descendants' Mem.
> > > Reports information only for process 0.
> > > --- Event Stage 0: Main Stage
> > > Viewer 1 0 0 0
> > > --- Event Stage 1: StepStage
> > > --- Event Stage 2: ConvStage
> > > --- Event Stage 3: ProjStage
> > > Vector 640 640 101604352 0
> > > Matrix 128 128 410327040 0
> > > Index Set 384 384 17062912 0
> > > Krylov Solver 256 256 282624 0
> > > Preconditioner 256 256 228352 0
> > > --- Event Stage 4: IoStage
> > > Vector 10 10 2636400 0
> > > Viewer 10 10 6880 0
> > > --- Event Stage 5: SolvAlloc
> > > Vector 140 6 8848 0
> > > Vector Scatter 6 0 0 0
> > > Matrix 6 0 0 0
> > > Distributed Mesh 2 0 0 0
> > > Bipartite Graph 4 0 0 0
> > > Index Set 14 14 372400 0
> > > IS L to G Mapping 3 0 0 0
> > > Krylov Solver 1 0 0 0
> > > Preconditioner 1 0 0 0
> > > --- Event Stage 6: SolvSolve
> > > Vector 5 0 0 0
> > > Matrix 1 0 0 0
> > > Index Set 3 0 0 0
> > > Krylov Solver 2 1 1136 0
> > > Preconditioner 2 1 824 0
> > > --- Event Stage 7: SolvDeall
> > > Vector 0 133 36676728 0
> > > Vector Scatter 0 1 1036 0
> > > Matrix 0 4 7038924 0
> > > Index Set 0 3 133304 0
> > > Krylov Solver 0 2 2208 0
> > > Preconditioner 0 2 1784 0
> > > ========================================================================================================================
> > > Average time to get PetscTime(): 9.53674e-08
> > > Average time for MPI_Barrier(): 1.12057e-05
> > > Average time for zero size MPI_Send(): 1.3113e-06
> > > #PETSc Option Table entries:
> > > -ksp_type cg
> > > -log_summary
> > > -pc_type bjacobi
> > > #End of PETSc Option Table entries
> > > Compiled without FORTRAN kernels
> > > Compiled with full precision matrices (default)
> > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> > > Configure run at:
> > > Configure options:
> > > Application 9457215 resources: utime ~5920s, stime ~58s

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

From filippo.leonardi at sam.math.ethz.ch Mon Sep 29 14:45:17 2014
From: filippo.leonardi at sam.math.ethz.ch (Leonardi Filippo)
Date: Mon, 29 Sep 2014 19:45:17 +0000
Subject: [petsc-users] Scaling/Preconditioners for Poisson equation
In-Reply-To:
References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii>
Message-ID:

Yes, ProjStage is what we are looking at. The different numbers of solves are due to timestepping. I can also remove that (and I did so for subsequent runs; the result is the same, by the way), but this was also sort of a "production run".
________________________________
From: Matthew Knepley [knepley at gmail.com]
Sent: Monday, 29 September 2014 21:39
To: Leonardi Filippo
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Scaling/Preconditioners for Poisson equation

On Mon, Sep 29, 2014 at 9:36 AM, Filippo Leonardi wrote:
Thank you.
Actually I had the feeling that it wasn't my problem with Bjacobi and CG.
So I'll stick to MG.
Problem with MG is that there are a lot of parameters to be tuned, so I leave the defaults (expect I select CG as Krylow method). I post just results for 64^3 and 128^3. Tell me if I'm missing some useful detail. (I get similar results with BoomerAMG). 1) I assume we are looking at ProjStage? 2) Why are you doing a different number of solves on the different number of processes? Matt Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg): 32^3 and 1 proc: 1.01e-1 64^3 and 8 proc: 6.56e-01 128^3 and 64 proc: 1.05e+00 Number of PCSetup per KSPSolve: 15 39 65 With BoomerAMG: stable 8 iterations per KSP but time per iteration greater than PETSc MG and still increases: 64^3: 3.17e+00 128^3: 9.99e+00 --> For instance with 64^3 (256 iterations): Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 Max Max/Min Avg Total Time (sec): 1.896e+02 1.00000 1.896e+02 Objects: 7.220e+03 1.00000 7.220e+03 Flops: 3.127e+10 1.00000 3.127e+10 2.502e+11 Flops/sec: 1.649e+08 1.00000 1.649e+08 1.319e+09 MPI Messages: 9.509e+04 1.00316 9.483e+04 7.586e+05 MPI Message Lengths: 1.735e+09 1.09967 1.685e+04 1.278e+10 MPI Reductions: 4.781e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3416e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: StepStage: 8.7909e-01 0.5% 1.8119e+09 0.7% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: ConvStage: 1.7172e+01 9.1% 9.2610e+09 3.7% 1.843e+05 24.3% 3.981e+03 23.6% 0.000e+00 0.0% 3: ProjStage: 1.6804e+02 88.6% 2.3813e+11 95.2% 5.703e+05 75.2% 1.232e+04 73.1% 4.627e+04 96.8% 4: IoStage: 1.5814e+00 0.8% 0.0000e+00 0.0% 1.420e+03 0.2% 4.993e+02 3.0% 2.500e+02 0.5% 5: SolvAlloc: 2.5722e-01 0.1% 0.0000e+00 0.0% 2.560e+02 0.0% 1.054e+00 0.0% 3.330e+02 0.7% 6: SolvSolve: 1.6776e+00 0.9% 9.5345e+08 0.4% 2.280e+03 0.3% 4.924e+01 0.3% 9.540e+02 2.0% 7: SolvDeall: 7.4017e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: StepStage VecAXPY 3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 99100 0 0 0 2052 --- Event Stage 2: ConvStage VecCopy 4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 VecAXPY 4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 6 26 0 0 0 1978 VecAXPBYCZ 5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 15 61 0 0 0 2179 VecPointwiseMult 4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 8 13 0 0 0 838 VecScatterBegin 7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04 0.0e+00 2 0 24 24 0 20 0100100 0 0 VecScatterEnd 7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 --- Event Stage 3: ProjStage VecMDot 2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00 2.6e+03 1 3 0 0 5 1 3 0 0 6 3364 VecTDot 19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00 2.0e+04 1 4 0 0 42 1 4 0 0 43 3829 VecNorm 13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00 1.3e+04 1 3 0 0 27 1 3 0 0 28 4442 VecScale 13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3764 VecCopy 21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 9 0 0 0 3 9 0 0 0 3624 VecAYPX 30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 5 0 0 0 3 6 0 0 0 2460 VecMAXPY 2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 3 0 0 0 1 4 0 0 0 4700 VecScatterBegin 23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04 0.0e+00 1 0 75 73 0 1 0100100 0 0 VecScatterEnd 23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecNormalize 2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00 2.8e+03 0 1 0 0 6 0 1 0 0 6 7504 MatMult 22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04 0.0e+00 25 33 72 70 0 28 35 96 96 0 1780 MatSOR 23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 50 33 0 0 0 56 35 0 0 0 872 KSPGMRESOrthog 2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00 2.6e+03 2 6 0 0 5 2 6 0 0 6 4085 KSPSetUp 768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.6e+03 0 0 0 0 12 0 0 0 0 12 0 KSPSolve 256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04 4.6e+04 88 95 72 70 97 99100 96 96100 1427 PCSetUp 256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+03 0 0 0 0 3 0 0 0 0 3 0 PCApply 10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04 1.3e+04 72 68 40 39 27 81 71 54 54 28 1245 --- Event Stage 4: IoStage VecView 50 1.0 8.8377e-0138.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 0 0 0 0 0 29 0 0 0 40 0 VecCopy 50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecScatterBegin 30 1.0 1.0644e-02 
1.6 0.00e+00 0.0 7.2e+02 1.6e+04 0.0e+00 0 0 0 0 0 1 0 51 3 0 0 VecScatterEnd 30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 8 0 0 0 0 0 --- Event Stage 5: SolvAlloc VecSet 50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 7 0 0 0 0 0 MatAssemblyBegin 4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 1 0 0 0 2 0 MatAssemblyEnd 4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03 1.6e+01 0 0 0 0 0 6 0 38 49 5 0 --- Event Stage 6: SolvSolve VecMDot 10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 3 0 0 1 3234 VecTDot 80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00 8.0e+01 0 0 0 0 0 1 4 0 0 8 3777 VecNorm 820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00 8.2e+02 0 0 0 0 2 13 3 0 0 86 101 VecScale 52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 3780 VecCopy 91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSet 86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 9 0 0 0 3593 VecAYPX 121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 6 0 0 0 2393 VecMAXPY 11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 4 0 0 0 4682 VecScatterBegin 95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04 0.0e+00 0 0 0 0 0 0 0100100 0 0 VecScatterEnd 95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 0 0 1 0 0 1 7144 MatMult 91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04 0.0e+00 0 0 0 0 0 11 35 96 96 0 1722 MatSOR 93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 23 35 0 0 0 870 KSPGMRESOrthog 10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 1 6 0 0 1 3966 KSPSetUp 3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 0 0 0 0 0 3 0 KSPSolve 1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04 1.9e+02 0 0 0 0 0 40100 96 96 19 1399 PCSetUp 1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 PCApply 41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04 5.1e+01 0 0 0 0 0 33 71 54 54 5 1224 --- Event Stage 7: SolvDeall ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Viewer 1 0 0 0 --- Event Stage 1: StepStage --- Event Stage 2: ConvStage --- Event Stage 3: ProjStage Vector 5376 5376 1417328640 0 Krylov Solver 768 768 8298496 0 Preconditioner 768 768 645120 0 --- Event Stage 4: IoStage Vector 50 50 13182000 0 Viewer 50 50 34400 0 --- Event Stage 5: SolvAlloc Vector 140 6 8848 0 Vector Scatter 6 0 0 0 Matrix 6 0 0 0 Distributed Mesh 2 0 0 0 Bipartite Graph 4 0 0 0 Index Set 14 14 372400 0 IS L to G Mapping 3 0 0 0 Krylov Solver 2 0 0 0 Preconditioner 2 0 0 0 --- Event Stage 6: SolvSolve Vector 22 0 0 0 Krylov Solver 3 2 2296 0 Preconditioner 3 2 1760 0 --- Event Stage 7: SolvDeall Vector 0 149 41419384 0 Vector Scatter 0 1 1036 0 Matrix 0 3 4619676 0 Krylov Solver 0 3 32416 0 Preconditioner 0 3 2520 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 4.62532e-06 Average time for zero size MPI_Send(): 1.51992e-06 #PETSc Option Table entries: -ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Configure options: --> And with 128^3 (512 iterations): Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 Max Max/Min Avg Total Time (sec): 5.889e+02 1.00000 5.889e+02 Objects: 1.413e+04 1.00000 1.413e+04 Flops: 9.486e+10 1.00000 9.486e+10 6.071e+12 Flops/sec: 1.611e+08 1.00000 1.611e+08 1.031e+10 MPI Messages: 5.392e+05 1.00578 5.361e+05 3.431e+07 MPI Message Lengths: 6.042e+09 1.36798 8.286e+03 2.843e+11 MPI Reductions: 1.343e+05 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.1330e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: StepStage: 1.7508e+00 0.3% 2.8991e+10 0.5% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: ConvStage: 3.5534e+01 6.0% 1.4818e+11 2.4% 5.898e+06 17.2% 1.408e+03 17.0% 0.000e+00 0.0% 3: ProjStage: 5.3568e+02 91.0% 5.8820e+12 96.9% 2.833e+07 82.6% 6.765e+03 81.6% 1.319e+05 98.2% 4: IoStage: 1.1365e+01 1.9% 0.0000e+00 0.0% 1.782e+04 0.1% 9.901e+01 1.2% 2.500e+02 0.2% 5: SolvAlloc: 7.1497e-01 0.1% 0.0000e+00 0.0% 5.632e+03 0.0% 1.866e-01 0.0% 3.330e+02 0.2% 6: SolvSolve: 3.7604e+00 0.6% 1.1888e+10 0.2% 5.722e+04 0.2% 1.366e+01 0.2% 1.803e+03 1.3% 7: SolvDeall: 7.6677e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage --- Event Stage 1: StepStage VecAXPY 6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 99100 0 0 0 15941 --- Event Stage 2: ConvStage VecCopy 9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 VecAXPY 9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 6 26 0 0 0 16076 VecAXPBYCZ 10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 14 61 0 0 0 17460 VecPointwiseMult 9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 8 13 0 0 0 6662 VecScatterBegin 15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03 0.0e+00 1 0 17 17 0 18 0100100 0 0 VecScatterEnd 15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 --- Event Stage 3: ProjStage VecMDot 5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00 5.1e+03 1 2 0 0 4 1 2 0 0 4 22644 VecTDot 66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00 6.6e+04 2 5 0 0 49 2 5 0 0 50 20295 VecNorm 39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00 3.9e+04 2 3 0 0 29 2 3 0 0 30 11392 VecScale 39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 29356 VecCopy 70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 9 0 0 0 3 10 0 0 0 29422 VecAYPX 99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00 0.0e+00 3 6 0 0 0 3 6 0 0 0 19464 VecMAXPY 5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 37169 VecScatterBegin 73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03 0.0e+00 1 0 83 82 0 1 0100100 0 0 VecScatterEnd 73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 VecNormalize 5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00 5.6e+03 0 1 0 0 4 0 1 0 0 4 38996 MatMult 71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03 0.0e+00 26 35 80 79 0 28 36 97 97 0 13462 MatSOR 72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00 0.0e+00 49 34 0 0 0 54 35 0 0 0 6953 KSPGMRESOrthog 5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00 5.1e+03 1 4 0 0 4 1 4 0 0 4 29218 KSPSetUp 1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 0 0 0 0 8 0 0 0 0 9 0 KSPSolve 512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03 1.3e+05 90 97 80 79 98 99100 97 97100 11034 PCSetUp 512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.1e+03 0 0 0 0 2 0 0 0 0 2 0 PCApply 33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03 2.6e+04 71 67 43 43 19 78 69 52 52 20 9585 --- Event Stage 4: IoStage VecView 50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02 1 0 0 0 0 34 0 0 0 40 0 VecCopy 50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 
VecScatterBegin 30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 VecScatterEnd 30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 10 0 0 0 0 0 --- Event Stage 5: SolvAlloc VecSet 50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 14 0 0 0 0 0 MatAssemblyBegin 4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 2 0 0 0 2 0 MatAssemblyEnd 4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 --- Event Stage 6: SolvSolve VecMDot 10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 2 0 0 1 19117 VecTDot 134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00 1.3e+02 0 0 0 0 0 1 5 0 0 7 21497 VecNorm 1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00 1.6e+03 0 0 0 0 1 29 3 0 0 90 223 VecScale 79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 27741 VecCopy 145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecSet 140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 10 0 0 0 28619 VecAYPX 202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 6 0 0 0 19893 VecMAXPY 11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 2 0 0 0 35242 VecScatterBegin 149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03 0.0e+00 0 0 0 0 0 0 0100100 0 0 VecScatterEnd 149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 VecNormalize 11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 0 0 1 0 0 1 9736 MatMult 145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03 0.0e+00 0 0 0 0 0 8 36 97 97 0 13055 MatSOR 147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 16 35 0 0 0 6923 KSPGMRESOrthog 10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 4 0 0 1 25694 KSPSetUp 3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 0 1 0 0 0 1 0 KSPSolve 1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03 2.7e+02 0 0 0 0 0 30100 97 97 15 10378 PCSetUp 1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03 5.1e+01 0 0 0 0 0 24 69 52 52 3 8959 --- Event Stage 7: SolvDeall ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. 
--- Event Stage 0: Main Stage Viewer 1 0 0 0 --- Event Stage 1: StepStage --- Event Stage 2: ConvStage --- Event Stage 3: ProjStage Vector 10752 10752 2834657280 0 Krylov Solver 1536 1536 16596992 0 Preconditioner 1536 1536 1290240 0 --- Event Stage 4: IoStage Vector 50 50 13182000 0 Viewer 50 50 34400 0 --- Event Stage 5: SolvAlloc Vector 140 6 8848 0 Vector Scatter 6 0 0 0 Matrix 6 0 0 0 Distributed Mesh 2 0 0 0 Bipartite Graph 4 0 0 0 Index Set 14 14 372400 0 IS L to G Mapping 3 0 0 0 Krylov Solver 2 0 0 0 Preconditioner 2 0 0 0 --- Event Stage 6: SolvSolve Vector 22 0 0 0 Krylov Solver 3 2 2296 0 Preconditioner 3 2 1760 0 --- Event Stage 7: SolvDeall Vector 0 149 41419384 0 Vector Scatter 0 1 1036 0 Matrix 0 3 4619676 0 Krylov Solver 0 3 32416 0 Preconditioner 0 3 2520 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 1.13964e-05 Average time for zero size MPI_Send(): 1.2815e-06 #PETSc Option Table entries: -ksp_type cg -log_summary -pc_mg_galerkin -pc_type mg #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Configure options: Best, Filippo On Monday 29 September 2014 08:58:35 Matthew Knepley wrote: > On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi < > > filippo.leonardi at sam.math.ethz.ch> wrote: > > Hi, > > > > I am trying to solve a standard second order central differenced Poisson > > equation in parallel, in 3D, using a 3D structured DMDAs (extremely > > standard > > Laplacian matrix). > > > > I want to get some nice scaling (especially weak), but my results show > > that > > the Krylow method is not performing as expected. The problem (at leas for > > CG + > > Bjacobi) seems to lie on the number of iterations. > > > > In particular the number of iterations grows with CG (the matrix is SPD) > > + > > BJacobi as mesh is refined (probably due to condition number increasing) > > and > > number of processors is increased (probably due to the Bjacobi > > preconditioner). For instance I tried the following setup: > > 1 procs to solve 32^3 domain => 20 iterations > > 8 procs to solve 64^3 domain => 60 iterations > > 64 procs to solve 128^3 domain => 101 iterations > > > > Is there something pathological with my runs (maybe I am missing > > something)? > > Is there somebody who can provide me weak scaling benchmarks for > > equivalent > > problems? (Maybe there is some better preconditioner for this problem). > > Bjacobi is not a scalable preconditioner. As you note, the number of > iterates grows > with the system size. You should always use MG here. > > > I am also aware that Multigrid is even better for this problems but the > > **scalability** of my runs seems to be as bad as with CG. > > MG will weak scale almost perfectly. Send -log_summary for each run if this > does not happen. > > Thanks, > > Matt > > > -pc_mg_galerkin > > -pc_type mg > > (both directly with richardson or as preconditioner to cg) > > > > The following is the "-log_summary" of a 128^3 run, notice that I solve > > the > > system multiple times (hence KSPSolve is multiplied by 128). Using CG + > > BJacobi. > > > > Tell me if I missed some detail and sorry for the length of the post. 
> > > > Thanks, > > Filippo > > > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > > > Max Max/Min Avg Total > > > > Time (sec): 9.095e+01 1.00001 9.095e+01 > > Objects: 1.875e+03 1.00000 1.875e+03 > > Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 > > Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 > > MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 > > MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 > > MPI Reductions: 4.136e+04 1.00000 > > > > Flop counting convention: 1 flop = 1 real number operation of type > > (multiply/divide/add/subtract) > > > > e.g., VecAXPY() for real vectors of length N > > > > --> > > 2N flops > > > > and VecAXPY() for complex vectors of length N > > > > --> > > 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > > --- > > -- Message Lengths -- -- Reductions -- > > > > Avg %Total Avg %Total counts > > %Total > > > > Avg %Total counts %Total > > > > 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 > > 22.1% > > > > 1.809e+03 21.8% 0.000e+00 0.0% > > > > 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 > > 77.1% > > > > 6.317e+03 76.3% 4.024e+04 97.3% > > > > 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 > > 0.1% > > > > 1.017e+02 1.2% 5.000e+01 0.1% > > > > 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 > > 0.1% > > > > 9.587e-01 0.0% 3.330e+02 0.8% > > > > 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 > > 0.7% > > > > 5.464e+01 0.7% 7.320e+02 1.8% > > > > 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 > > 0.0% > > > > 0.000e+00 0.0% 0.000e+00 0.0% > > > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- See the 'Profiling' chapter > > of the users' manual for details on > > interpreting > > output. > > > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > > > Ratio - ratio of maximum to minimum over all processors > > > > Mess: number of messages sent > > Avg. len: average message length > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > > > > PetscLogStagePop(). 
> > > > %T - percent time in this phase %f - percent flops in this > > > > phase > > > > %M - percent messages in this phase %L - percent message lengths > > > > in > > this phase > > > > %R - percent reductions in this phase > > > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > > > over all > > processors) > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- Event Count > > Time (sec) Flops > > --- Global --- --- Stage --- Total > > > > Max Ratio Max Ratio Max Ratio Mess Avg len > > > > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- > > > > --- Event Stage 0: Main Stage > > > > > > --- Event Stage 1: StepStage > > > > VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 > > > > --- Event Stage 2: ConvStage > > > > VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > > VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 > > VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 > > VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 > > VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 > > 0.0e+00 2 0 22 22 0 18 0100100 0 0 > > VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > > > --- Event Stage 3: ProjStage > > > > VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 > > 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 > > VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 > > 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 > > VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > > VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 > > VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 > > 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 > > VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 > > 0.0e+00 1 0 77 76 0 1 0100100 0 0 > > VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > > MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 > > 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 > > MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 > > MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 > > MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 > > MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 > > KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > > 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 > > KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 > > 4.0e+04 84 95 74 73 97 99100 96 96100 13800 > > PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 > > 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 > > PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 
0.0e+00 > > 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 > > PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 > > 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 > > > > --- Event Stage 4: IoStage > > > > VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 > > VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 > > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > > VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 > > > > --- Event Stage 5: SolvAlloc > > > > VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 > > MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 > > MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 > > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > > > --- Event Stage 6: SolvSolve > > > > VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 > > 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 > > VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 > > 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 > > VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 > > VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 > > VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 > > 0.0e+00 0 0 1 1 0 1 0100100 0 0 > > VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 > > 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 > > MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 > > MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 > > MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 > > MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 > > 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 > > KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 > > 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 > > PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 > > 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 > > PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 > > 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 > > PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 > > 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 > > > > --- Event Stage 7: SolvDeall > > > > > > -------------------------------------------------------------------------- > > ---------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > > Mem. > > Reports information only for process 0. 
> > > > --- Event Stage 0: Main Stage > > > > Viewer 1 0 0 0 > > > > --- Event Stage 1: StepStage > > > > > > --- Event Stage 2: ConvStage > > > > > > --- Event Stage 3: ProjStage > > > > Vector 640 640 101604352 0 > > Matrix 128 128 410327040 0 > > > > Index Set 384 384 17062912 0 > > > > Krylov Solver 256 256 282624 0 > > > > Preconditioner 256 256 228352 0 > > > > --- Event Stage 4: IoStage > > > > Vector 10 10 2636400 0 > > Viewer 10 10 6880 0 > > > > --- Event Stage 5: SolvAlloc > > > > Vector 140 6 8848 0 > > > > Vector Scatter 6 0 0 0 > > > > Matrix 6 0 0 0 > > > > Distributed Mesh 2 0 0 0 > > > > Bipartite Graph 4 0 0 0 > > > > Index Set 14 14 372400 0 > > > > IS L to G Mapping 3 0 0 0 > > > > Krylov Solver 1 0 0 0 > > > > Preconditioner 1 0 0 0 > > > > --- Event Stage 6: SolvSolve > > > > Vector 5 0 0 0 > > Matrix 1 0 0 0 > > > > Index Set 3 0 0 0 > > > > Krylov Solver 2 1 1136 0 > > > > Preconditioner 2 1 824 0 > > > > --- Event Stage 7: SolvDeall > > > > Vector 0 133 36676728 0 > > > > Vector Scatter 0 1 1036 0 > > > > Matrix 0 4 7038924 0 > > > > Index Set 0 3 133304 0 > > > > Krylov Solver 0 2 2208 0 > > > > Preconditioner 0 2 1784 0 > > > > ========================================================================== > > ============================================== Average time to get > > PetscTime(): 9.53674e-08 > > Average time for MPI_Barrier(): 1.12057e-05 > > Average time for zero size MPI_Send(): 1.3113e-06 > > #PETSc Option Table entries: > > -ksp_type cg > > -log_summary > > -pc_type bjacobi > > #End of PETSc Option Table entries > > Compiled without FORTRAN kernels > > Compiled with full precision matrices (default) > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > > Configure run at: > > Configure options: > > Application 9457215 resources: utime ~5920s, stime ~58s -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Mon Sep 29 14:53:11 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 Sep 2014 14:53:11 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <20769360.6VtOyZos7S@besikovitch-ii> References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii> Message-ID: <148FE4BD-CE0F-4A1F-A1EE-2268FF784FE1@mcs.anl.gov> On Sep 29, 2014, at 9:36 AM, Filippo Leonardi wrote: > Thank you. > > Actually I had the feeling that it wasn't my problem with Bjacobi and CG. > > So I'll stick to MG. Problem with MG is that there are a lot of parameters to > be tuned, so I leave the defaults (expect I select CG as Krylow method). I > post just results for 64^3 and 128^3. Tell me if I'm missing some useful > detail. (I get similar results with BoomerAMG). > > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin -pc_type > mg): > 32^3 and 1 proc: 1.01e-1 > 64^3 and 8 proc: 6.56e-01 > 128^3 and 64 proc: 1.05e+00 > Number of PCSetup per KSPSolve: > 15 > 39 > 65 You are not setting the number of levels with mg here. Likely it is always using 1 level. 
Run with -ksp_view you may need to use -pc_mg_levels > > With BoomerAMG: > stable 8 iterations per KSP but time per iteration greater than PETSc MG and > still increases: > 64^3: 3.17e+00 > 128^3: 9.99e+00 > > > --> For instance with 64^3 (256 iterations): > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > Max Max/Min Avg Total > Time (sec): 1.896e+02 1.00000 1.896e+02 > Objects: 7.220e+03 1.00000 7.220e+03 > Flops: 3.127e+10 1.00000 3.127e+10 2.502e+11 > Flops/sec: 1.649e+08 1.00000 1.649e+08 1.319e+09 > MPI Messages: 9.509e+04 1.00316 9.483e+04 7.586e+05 > MPI Message Lengths: 1.735e+09 1.09967 1.685e+04 1.278e+10 > MPI Reductions: 4.781e+04 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> > 2N flops > and VecAXPY() for complex vectors of length N --> > 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- > -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total > Avg %Total counts %Total > 0: Main Stage: 1.3416e-02 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 1: StepStage: 8.7909e-01 0.5% 1.8119e+09 0.7% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 2: ConvStage: 1.7172e+01 9.1% 9.2610e+09 3.7% 1.843e+05 24.3% > 3.981e+03 23.6% 0.000e+00 0.0% > 3: ProjStage: 1.6804e+02 88.6% 2.3813e+11 95.2% 5.703e+05 75.2% > 1.232e+04 73.1% 4.627e+04 96.8% > 4: IoStage: 1.5814e+00 0.8% 0.0000e+00 0.0% 1.420e+03 0.2% > 4.993e+02 3.0% 2.500e+02 0.5% > 5: SolvAlloc: 2.5722e-01 0.1% 0.0000e+00 0.0% 2.560e+02 0.0% > 1.054e+00 0.0% 3.330e+02 0.7% > 6: SolvSolve: 1.6776e+00 0.9% 9.5345e+08 0.4% 2.280e+03 0.3% > 4.924e+01 0.3% 9.540e+02 2.0% > 7: SolvDeall: 7.4017e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting > output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %f - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in > this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all > processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > > --- Event Stage 1: StepStage > > VecAXPY 3072 1.0 8.8295e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 99100 0 0 0 2052 > > --- Event Stage 2: ConvStage > > VecCopy 4608 1.0 1.6016e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > VecAXPY 4608 1.0 1.2212e+00 1.2 3.02e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 6 26 0 0 0 1978 > VecAXPBYCZ 5376 1.0 2.5875e+00 1.1 7.05e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 15 61 0 0 0 2179 > VecPointwiseMult 4608 1.0 1.4411e+00 1.0 1.51e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 8 13 0 0 0 838 > VecScatterBegin 7680 1.0 3.4130e+00 1.0 0.00e+00 0.0 1.8e+05 1.6e+04 > 0.0e+00 2 0 24 24 0 20 0100100 0 0 > VecScatterEnd 7680 1.0 9.3412e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 5 0 0 0 0 0 > > --- Event Stage 3: ProjStage > > VecMDot 2560 1.0 2.1944e+00 1.1 9.23e+08 1.0 0.0e+00 0.0e+00 > 2.6e+03 1 3 0 0 5 1 3 0 0 6 3364 > VecTDot 19924 1.0 2.7283e+00 1.3 1.31e+09 1.0 0.0e+00 0.0e+00 > 2.0e+04 1 4 0 0 42 1 4 0 0 43 3829 > VecNorm 13034 1.0 1.5385e+00 2.0 8.54e+08 1.0 0.0e+00 0.0e+00 > 1.3e+04 1 3 0 0 27 1 3 0 0 28 4442 > VecScale 13034 1.0 9.0783e-01 1.3 4.27e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 3764 > VecCopy 21972 1.0 3.5136e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > VecSet 21460 1.0 1.3108e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 41384 1.0 5.9866e+00 1.1 2.71e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 9 0 0 0 3 9 0 0 0 3624 > VecAYPX 30142 1.0 5.3362e+00 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 5 0 0 0 3 6 0 0 0 2460 > VecMAXPY 2816 1.0 1.8561e+00 1.0 1.09e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 3 0 0 0 1 4 0 0 0 4700 > VecScatterBegin 23764 1.0 1.7138e+00 1.1 0.00e+00 0.0 5.7e+05 1.6e+04 > 0.0e+00 1 0 75 73 0 1 0100100 0 0 > VecScatterEnd 23764 1.0 3.1986e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecNormalize 2816 1.0 2.9511e-01 1.1 2.77e+08 1.0 0.0e+00 0.0e+00 > 2.8e+03 0 1 0 0 6 0 1 0 0 6 7504 > MatMult 22740 1.0 4.6896e+01 1.0 1.04e+10 1.0 5.5e+05 1.6e+04 > 0.0e+00 25 33 72 70 0 28 35 96 96 0 1780 > MatSOR 23252 1.0 9.5250e+01 1.0 1.04e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 50 33 0 0 0 56 35 0 0 0 872 > KSPGMRESOrthog 2560 1.0 3.6142e+00 1.1 1.85e+09 1.0 0.0e+00 0.0e+00 > 2.6e+03 2 6 0 0 5 2 6 0 0 6 4085 > KSPSetUp 768 1.0 7.9389e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 5.6e+03 0 0 0 0 12 0 0 0 0 12 0 > KSPSolve 256 1.0 1.6661e+02 1.0 2.97e+10 1.0 5.5e+05 1.6e+04 > 4.6e+04 88 95 72 70 97 99100 96 96100 1427 > PCSetUp 256 1.0 2.6755e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.5e+03 0 0 0 0 3 0 0 0 0 3 0 > PCApply 10218 1.0 1.3642e+02 1.0 2.12e+10 1.0 3.1e+05 1.6e+04 > 1.3e+04 72 68 40 39 27 81 71 54 54 28 1245 > > --- Event Stage 4: IoStage > > VecView 50 1.0 8.8377e-0138.4 0.00e+00 0.0 0.0e+00 
0.0e+00 > 1.0e+02 0 0 0 0 0 29 0 0 0 40 0 > VecCopy 50 1.0 8.9977e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecScatterBegin 30 1.0 1.0644e-02 1.6 0.00e+00 0.0 7.2e+02 1.6e+04 > 0.0e+00 0 0 0 0 0 1 0 51 3 0 0 > VecScatterEnd 30 1.0 2.4857e-01109.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 8 0 0 0 0 0 > > --- Event Stage 5: SolvAlloc > > VecSet 50 1.0 1.9324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 7 0 0 0 0 0 > MatAssemblyBegin 4 1.0 5.0378e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 1 0 0 0 2 0 > MatAssemblyEnd 4 1.0 1.5030e-02 1.0 0.00e+00 0.0 9.6e+01 4.1e+03 > 1.6e+01 0 0 0 0 0 6 0 38 49 5 0 > > --- Event Stage 6: SolvSolve > > VecMDot 10 1.0 8.9154e-03 1.1 3.60e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 3 0 0 1 3234 > VecTDot 80 1.0 1.1104e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00 > 8.0e+01 0 0 0 0 0 1 4 0 0 8 3777 > VecNorm 820 1.0 2.6904e-01 1.6 3.41e+06 1.0 0.0e+00 0.0e+00 > 8.2e+02 0 0 0 0 2 13 3 0 0 86 101 > VecScale 52 1.0 3.6066e-03 1.2 1.70e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 1 0 0 0 3780 > VecCopy 91 1.0 1.4363e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecSet 86 1.0 5.1112e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 169 1.0 2.4659e-02 1.1 1.11e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 9 0 0 0 3593 > VecAYPX 121 1.0 2.2017e-02 1.1 6.59e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 6 0 0 0 2393 > VecMAXPY 11 1.0 7.2782e-03 1.0 4.26e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 4 0 0 0 4682 > VecScatterBegin 95 1.0 7.3617e-03 1.1 0.00e+00 0.0 2.3e+03 1.6e+04 > 0.0e+00 0 0 0 0 0 0 0100100 0 0 > VecScatterEnd 95 1.0 1.3788e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecNormalize 11 1.0 1.2109e-03 1.1 1.08e+06 1.0 0.0e+00 0.0e+00 > 1.1e+01 0 0 0 0 0 0 1 0 0 1 7144 > MatMult 91 1.0 1.9398e-01 1.0 4.17e+07 1.0 2.2e+03 1.6e+04 > 0.0e+00 0 0 0 0 0 11 35 96 96 0 1722 > MatSOR 93 1.0 3.8194e-01 1.0 4.16e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 23 35 0 0 0 870 > KSPGMRESOrthog 10 1.0 1.4540e-02 1.1 7.21e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 1 6 0 0 1 3966 > KSPSetUp 3 1.0 5.2021e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 0 0 0 0 0 3 0 > KSPSolve 1 1.0 6.7911e-01 1.0 1.19e+08 1.0 2.2e+03 1.6e+04 > 1.9e+02 0 0 0 0 0 40100 96 96 19 1399 > PCSetUp 1 1.0 1.9128e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 > PCApply 41 1.0 5.5355e-01 1.0 8.47e+07 1.0 1.2e+03 1.6e+04 > 5.1e+01 0 0 0 0 0 33 71 54 54 5 1224 > > --- Event Stage 7: SolvDeall > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Viewer 1 0 0 0 > > --- Event Stage 1: StepStage > > > --- Event Stage 2: ConvStage > > > --- Event Stage 3: ProjStage > > Vector 5376 5376 1417328640 0 > Krylov Solver 768 768 8298496 0 > Preconditioner 768 768 645120 0 > > --- Event Stage 4: IoStage > > Vector 50 50 13182000 0 > Viewer 50 50 34400 0 > > --- Event Stage 5: SolvAlloc > > Vector 140 6 8848 0 > Vector Scatter 6 0 0 0 > Matrix 6 0 0 0 > Distributed Mesh 2 0 0 0 > Bipartite Graph 4 0 0 0 > Index Set 14 14 372400 0 > IS L to G Mapping 3 0 0 0 > Krylov Solver 2 0 0 0 > Preconditioner 2 0 0 0 > > --- Event Stage 6: SolvSolve > > Vector 22 0 0 0 > Krylov Solver 3 2 2296 0 > Preconditioner 3 2 1760 0 > > --- Event Stage 7: SolvDeall > > Vector 0 149 41419384 0 > Vector Scatter 0 1 1036 0 > Matrix 0 3 4619676 0 > Krylov Solver 0 3 32416 0 > Preconditioner 0 3 2520 0 > ======================================================================================================================== > Average time to get PetscTime(): 1.90735e-07 > Average time for MPI_Barrier(): 4.62532e-06 > Average time for zero size MPI_Send(): 1.51992e-06 > #PETSc Option Table entries: > -ksp_type cg > -log_summary > -pc_mg_galerkin > -pc_type mg > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: > Configure options: > > --> And with 128^3 (512 iterations): > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > Max Max/Min Avg Total > Time (sec): 5.889e+02 1.00000 5.889e+02 > Objects: 1.413e+04 1.00000 1.413e+04 > Flops: 9.486e+10 1.00000 9.486e+10 6.071e+12 > Flops/sec: 1.611e+08 1.00000 1.611e+08 1.031e+10 > MPI Messages: 5.392e+05 1.00578 5.361e+05 3.431e+07 > MPI Message Lengths: 6.042e+09 1.36798 8.286e+03 2.843e+11 > MPI Reductions: 1.343e+05 1.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> > 2N flops > and VecAXPY() for complex vectors of length N --> > 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- > -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total > Avg %Total counts %Total > 0: Main Stage: 1.1330e-01 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 1: StepStage: 1.7508e+00 0.3% 2.8991e+10 0.5% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > 2: ConvStage: 3.5534e+01 6.0% 1.4818e+11 2.4% 5.898e+06 17.2% > 1.408e+03 17.0% 0.000e+00 0.0% > 3: ProjStage: 5.3568e+02 91.0% 5.8820e+12 96.9% 2.833e+07 82.6% > 6.765e+03 81.6% 1.319e+05 98.2% > 4: IoStage: 1.1365e+01 1.9% 0.0000e+00 0.0% 1.782e+04 0.1% > 9.901e+01 1.2% 2.500e+02 0.2% > 5: SolvAlloc: 7.1497e-01 0.1% 0.0000e+00 0.0% 5.632e+03 0.0% > 1.866e-01 0.0% 3.330e+02 0.2% > 6: SolvSolve: 3.7604e+00 0.6% 1.1888e+10 0.2% 5.722e+04 0.2% > 1.366e+01 0.2% 1.803e+03 1.3% > 7: SolvDeall: 7.6677e-04 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% > 0.000e+00 0.0% 0.000e+00 0.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting > output. 
> Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). > %T - percent time in this phase %f - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in > this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all > processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > > --- Event Stage 1: StepStage > > VecAXPY 6144 1.0 1.8187e+00 1.1 4.53e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 99100 0 0 0 15941 > > --- Event Stage 2: ConvStage > > VecCopy 9216 1.0 3.2440e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 > VecAXPY 9216 1.0 2.4045e+00 1.1 6.04e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 6 26 0 0 0 16076 > VecAXPBYCZ 10752 1.0 5.1656e+00 1.1 1.41e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 14 61 0 0 0 17460 > VecPointwiseMult 9216 1.0 2.9012e+00 1.0 3.02e+08 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 8 13 0 0 0 6662 > VecScatterBegin 15360 1.0 7.3895e+00 1.3 0.00e+00 0.0 5.9e+06 8.2e+03 > 0.0e+00 1 0 17 17 0 18 0100100 0 0 > VecScatterEnd 15360 1.0 4.4483e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 > > --- Event Stage 3: ProjStage > > VecMDot 5120 1.0 5.2159e+00 1.2 1.85e+09 1.0 0.0e+00 0.0e+00 > 5.1e+03 1 2 0 0 4 1 2 0 0 4 22644 > VecTDot 66106 1.0 1.3662e+01 1.4 4.33e+09 1.0 0.0e+00 0.0e+00 > 6.6e+04 2 5 0 0 49 2 5 0 0 50 20295 > VecNorm 39197 1.0 1.4431e+01 2.8 2.57e+09 1.0 0.0e+00 0.0e+00 > 3.9e+04 2 3 0 0 29 2 3 0 0 30 11392 > VecScale 39197 1.0 2.8002e+00 1.2 1.28e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 29356 > VecCopy 70202 1.0 1.1299e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > VecSet 69178 1.0 3.9612e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 > VecAXPY 135284 1.0 1.9286e+01 1.1 8.87e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 9 0 0 0 3 10 0 0 0 29422 > VecAYPX 99671 1.0 1.7862e+01 1.1 5.43e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 3 6 0 0 0 3 6 0 0 0 19464 > VecMAXPY 5632 1.0 3.7555e+00 1.0 2.18e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 1 2 0 0 0 1 2 0 0 0 37169 > VecScatterBegin 73786 1.0 6.2463e+00 1.2 0.00e+00 0.0 2.8e+07 8.2e+03 > 0.0e+00 1 0 83 82 0 1 0100100 0 0 > VecScatterEnd 73786 1.0 2.1679e+01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 > VecNormalize 5632 1.0 9.0864e-01 1.2 5.54e+08 1.0 0.0e+00 0.0e+00 > 5.6e+03 0 1 0 0 4 0 1 0 0 4 38996 > MatMult 71738 1.0 1.5645e+02 1.1 3.29e+10 1.0 2.8e+07 8.2e+03 > 0.0e+00 26 35 80 79 0 28 36 97 97 0 13462 > MatSOR 72762 1.0 2.9900e+02 1.0 3.25e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 49 34 0 0 0 54 35 0 0 0 6953 > KSPGMRESOrthog 5120 1.0 8.0849e+00 1.1 3.69e+09 1.0 0.0e+00 0.0e+00 > 5.1e+03 1 4 0 0 4 1 4 0 0 4 29218 > KSPSetUp 1536 1.0 2.0613e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 
1.1e+04 0 0 0 0 8 0 0 0 0 9 0 > KSPSolve 512 1.0 5.3248e+02 1.0 9.18e+10 1.0 2.8e+07 8.2e+03 > 1.3e+05 90 97 80 79 98 99100 97 97100 11034 > PCSetUp 512 1.0 5.6760e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 3.1e+03 0 0 0 0 2 0 0 0 0 2 0 > PCApply 33565 1.0 4.2495e+02 1.0 6.36e+10 1.0 1.5e+07 8.2e+03 > 2.6e+04 71 67 43 43 19 78 69 52 52 20 9585 > > --- Event Stage 4: IoStage > > VecView 50 1.0 7.7463e+00240.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+02 1 0 0 0 0 34 0 0 0 40 0 > VecCopy 50 1.0 1.0773e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 30 1.0 1.1727e-02 2.3 0.00e+00 0.0 1.2e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 > VecScatterEnd 30 1.0 2.2058e+00701.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 10 0 0 0 0 0 > > --- Event Stage 5: SolvAlloc > > VecSet 50 1.0 1.3748e-01 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 14 0 0 0 0 0 > MatAssemblyBegin 4 1.0 3.1760e-0217.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 2 0 0 0 2 0 > MatAssemblyEnd 4 1.0 2.1847e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 > 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 > > --- Event Stage 6: SolvSolve > > VecMDot 10 1.0 1.2067e-02 1.5 3.60e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 2 0 0 1 19117 > VecTDot 134 1.0 2.6145e-02 1.5 8.78e+06 1.0 0.0e+00 0.0e+00 > 1.3e+02 0 0 0 0 0 1 5 0 0 7 21497 > VecNorm 1615 1.0 1.4866e+00 3.5 5.18e+06 1.0 0.0e+00 0.0e+00 > 1.6e+03 0 0 0 0 1 29 3 0 0 90 223 > VecScale 79 1.0 5.9721e-03 1.2 2.59e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 1 0 0 0 27741 > VecCopy 145 1.0 2.4912e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecSet 140 1.0 7.9901e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 277 1.0 4.0597e-02 1.2 1.82e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 10 0 0 0 28619 > VecAYPX 202 1.0 3.5421e-02 1.1 1.10e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 6 0 0 0 19893 > VecMAXPY 11 1.0 7.7360e-03 1.1 4.26e+06 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 2 0 0 0 35242 > VecScatterBegin 149 1.0 1.4983e-02 1.2 0.00e+00 0.0 5.7e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 0 0100100 0 0 > VecScatterEnd 149 1.0 5.0236e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > VecNormalize 11 1.0 7.1080e-03 3.9 1.08e+06 1.0 0.0e+00 0.0e+00 > 1.1e+01 0 0 0 0 0 0 1 0 0 1 9736 > MatMult 145 1.0 3.2611e-01 1.1 6.65e+07 1.0 5.6e+04 8.2e+03 > 0.0e+00 0 0 0 0 0 8 36 97 97 0 13055 > MatSOR 147 1.0 6.0702e-01 1.0 6.57e+07 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 16 35 0 0 0 6923 > KSPGMRESOrthog 10 1.0 1.7956e-02 1.3 7.21e+06 1.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 4 0 0 1 25694 > KSPSetUp 3 1.0 3.0483e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 0 1 0 0 0 1 0 > KSPSolve 1 1.0 1.1431e+00 1.0 1.85e+08 1.0 5.6e+04 8.2e+03 > 2.7e+02 0 0 0 0 0 30100 97 97 15 10378 > PCSetUp 1 1.0 1.1488e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PCApply 68 1.0 9.1644e-01 1.0 1.28e+08 1.0 3.0e+04 8.2e+03 > 5.1e+01 0 0 0 0 0 24 69 52 52 3 8959 > > --- Event Stage 7: SolvDeall > > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. 
> > --- Event Stage 0: Main Stage > > Viewer 1 0 0 0 > > --- Event Stage 1: StepStage > > > --- Event Stage 2: ConvStage > > > --- Event Stage 3: ProjStage > > Vector 10752 10752 2834657280 0 > Krylov Solver 1536 1536 16596992 0 > Preconditioner 1536 1536 1290240 0 > > --- Event Stage 4: IoStage > > Vector 50 50 13182000 0 > Viewer 50 50 34400 0 > > --- Event Stage 5: SolvAlloc > > Vector 140 6 8848 0 > Vector Scatter 6 0 0 0 > Matrix 6 0 0 0 > Distributed Mesh 2 0 0 0 > Bipartite Graph 4 0 0 0 > Index Set 14 14 372400 0 > IS L to G Mapping 3 0 0 0 > Krylov Solver 2 0 0 0 > Preconditioner 2 0 0 0 > > --- Event Stage 6: SolvSolve > > Vector 22 0 0 0 > Krylov Solver 3 2 2296 0 > Preconditioner 3 2 1760 0 > > --- Event Stage 7: SolvDeall > > Vector 0 149 41419384 0 > Vector Scatter 0 1 1036 0 > Matrix 0 3 4619676 0 > Krylov Solver 0 3 32416 0 > Preconditioner 0 3 2520 0 > ======================================================================================================================== > Average time to get PetscTime(): 9.53674e-08 > Average time for MPI_Barrier(): 1.13964e-05 > Average time for zero size MPI_Send(): 1.2815e-06 > #PETSc Option Table entries: > -ksp_type cg > -log_summary > -pc_mg_galerkin > -pc_type mg > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure run at: > Configure options: > > Best, > Filippo > > On Monday 29 September 2014 08:58:35 Matthew Knepley wrote: >> On Mon, Sep 29, 2014 at 8:42 AM, Filippo Leonardi < >> >> filippo.leonardi at sam.math.ethz.ch> wrote: >>> Hi, >>> >>> I am trying to solve a standard second order central differenced Poisson >>> equation in parallel, in 3D, using a 3D structured DMDAs (extremely >>> standard >>> Laplacian matrix). >>> >>> I want to get some nice scaling (especially weak), but my results show >>> that >>> the Krylow method is not performing as expected. The problem (at leas for >>> CG + >>> Bjacobi) seems to lie on the number of iterations. >>> >>> In particular the number of iterations grows with CG (the matrix is SPD) >>> + >>> BJacobi as mesh is refined (probably due to condition number increasing) >>> and >>> number of processors is increased (probably due to the Bjacobi >>> preconditioner). For instance I tried the following setup: >>> 1 procs to solve 32^3 domain => 20 iterations >>> 8 procs to solve 64^3 domain => 60 iterations >>> 64 procs to solve 128^3 domain => 101 iterations >>> >>> Is there something pathological with my runs (maybe I am missing >>> something)? >>> Is there somebody who can provide me weak scaling benchmarks for >>> equivalent >>> problems? (Maybe there is some better preconditioner for this problem). >> >> Bjacobi is not a scalable preconditioner. As you note, the number of >> iterates grows >> with the system size. You should always use MG here. >> >>> I am also aware that Multigrid is even better for this problems but the >>> **scalability** of my runs seems to be as bad as with CG. >> >> MG will weak scale almost perfectly. Send -log_summary for each run if this >> does not happen. >> >> Thanks, >> >> Matt >> >>> -pc_mg_galerkin >>> -pc_type mg >>> (both directly with richardson or as preconditioner to cg) >>> >>> The following is the "-log_summary" of a 128^3 run, notice that I solve >>> the >>> system multiple times (hence KSPSolve is multiplied by 128). Using CG + >>> BJacobi. 
>>> >>> Tell me if I missed some detail and sorry for the length of the post. >>> >>> Thanks, >>> Filippo >>> >>> Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 >>> >>> Max Max/Min Avg Total >>> >>> Time (sec): 9.095e+01 1.00001 9.095e+01 >>> Objects: 1.875e+03 1.00000 1.875e+03 >>> Flops: 1.733e+10 1.00000 1.733e+10 1.109e+12 >>> Flops/sec: 1.905e+08 1.00001 1.905e+08 1.219e+10 >>> MPI Messages: 1.050e+05 1.00594 1.044e+05 6.679e+06 >>> MPI Message Lengths: 1.184e+09 1.37826 8.283e+03 5.532e+10 >>> MPI Reductions: 4.136e+04 1.00000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> >>> e.g., VecAXPY() for real vectors of length N >>> >>> --> >>> 2N flops >>> >>> and VecAXPY() for complex vectors of length N >>> >>> --> >>> 8N flops >>> >>> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >>> --- >>> -- Message Lengths -- -- Reductions -- >>> >>> Avg %Total Avg %Total counts >>> %Total >>> >>> Avg %Total counts %Total >>> >>> 0: Main Stage: 1.1468e-01 0.1% 0.0000e+00 0.0% 0.000e+00 >>> 0.0% >>> >>> 0.000e+00 0.0% 0.000e+00 0.0% >>> >>> 1: StepStage: 4.4170e-01 0.5% 7.2478e+09 0.7% 0.000e+00 >>> 0.0% >>> >>> 0.000e+00 0.0% 0.000e+00 0.0% >>> >>> 2: ConvStage: 8.8333e+00 9.7% 3.7044e+10 3.3% 1.475e+06 >>> 22.1% >>> >>> 1.809e+03 21.8% 0.000e+00 0.0% >>> >>> 3: ProjStage: 7.7169e+01 84.8% 1.0556e+12 95.2% 5.151e+06 >>> 77.1% >>> >>> 6.317e+03 76.3% 4.024e+04 97.3% >>> >>> 4: IoStage: 2.4789e+00 2.7% 0.0000e+00 0.0% 3.564e+03 >>> 0.1% >>> >>> 1.017e+02 1.2% 5.000e+01 0.1% >>> >>> 5: SolvAlloc: 7.0947e-01 0.8% 0.0000e+00 0.0% 5.632e+03 >>> 0.1% >>> >>> 9.587e-01 0.0% 3.330e+02 0.8% >>> >>> 6: SolvSolve: 1.2044e+00 1.3% 9.1679e+09 0.8% 4.454e+04 >>> 0.7% >>> >>> 5.464e+01 0.7% 7.320e+02 1.8% >>> >>> 7: SolvDeall: 7.5711e-04 0.0% 0.0000e+00 0.0% 0.000e+00 >>> 0.0% >>> >>> 0.000e+00 0.0% 0.000e+00 0.0% >>> >>> >>> -------------------------------------------------------------------------- >>> ---------------------------------------------- See the 'Profiling' chapter >>> of the users' manual for details on >>> interpreting >>> output. >>> >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flops: Max - maximum over all processors >>> >>> Ratio - ratio of maximum to minimum over all processors >>> >>> Mess: number of messages sent >>> Avg. len: average message length >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() and >>> >>> PetscLogStagePop(). 
>>> >>> %T - percent time in this phase %f - percent flops in this >>> >>> phase >>> >>> %M - percent messages in this phase %L - percent message lengths >>> >>> in >>> this phase >>> >>> %R - percent reductions in this phase >>> >>> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >>> >>> over all >>> processors) >>> >>> -------------------------------------------------------------------------- >>> ---------------------------------------------- Event Count >>> Time (sec) Flops >>> --- Global --- --- Stage --- Total >>> >>> Max Ratio Max Ratio Max Ratio Mess Avg len >>> >>> Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s >>> >>> -------------------------------------------------------------------------- >>> ---------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> >>> --- Event Stage 1: StepStage >>> >>> VecAXPY 1536 1.0 4.6436e-01 1.1 1.13e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 1 0 0 0 99100 0 0 0 15608 >>> >>> --- Event Stage 2: ConvStage >>> >>> VecCopy 2304 1.0 8.1658e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 9 0 0 0 0 0 >>> VecAXPY 2304 1.0 6.1324e-01 1.2 1.51e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 1 0 0 0 6 26 0 0 0 15758 >>> VecAXPBYCZ 2688 1.0 1.3029e+00 1.1 3.52e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 2 0 0 0 14 61 0 0 0 17306 >>> VecPointwiseMult 2304 1.0 7.2368e-01 1.0 7.55e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 8 13 0 0 0 6677 >>> VecScatterBegin 3840 1.0 1.8182e+00 1.3 0.00e+00 0.0 1.5e+06 8.2e+03 >>> 0.0e+00 2 0 22 22 0 18 0100100 0 0 >>> VecScatterEnd 3840 1.0 1.1972e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 10 0 0 0 0 0 >>> >>> --- Event Stage 3: ProjStage >>> >>> VecTDot 25802 1.0 4.2552e+00 1.3 1.69e+09 1.0 0.0e+00 0.0e+00 >>> 2.6e+04 4 10 0 0 62 5 10 0 0 64 25433 >>> VecNorm 13029 1.0 3.0772e+00 3.3 8.54e+08 1.0 0.0e+00 0.0e+00 >>> 1.3e+04 2 5 0 0 32 2 5 0 0 32 17759 >>> VecCopy 640 1.0 2.4339e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 13157 1.0 7.0903e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >>> VecAXPY 26186 1.0 4.1462e+00 1.1 1.72e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 4 10 0 0 0 5 10 0 0 0 26490 >>> VecAYPX 12773 1.0 1.9135e+00 1.1 8.37e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 2 5 0 0 0 2 5 0 0 0 27997 >>> VecScatterBegin 13413 1.0 1.0689e+00 1.1 0.00e+00 0.0 5.2e+06 8.2e+03 >>> 0.0e+00 1 0 77 76 0 1 0100100 0 0 >>> VecScatterEnd 13413 1.0 2.7944e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 2 0 0 0 0 3 0 0 0 0 0 >>> MatMult 12901 1.0 3.2072e+01 1.0 5.92e+09 1.0 5.0e+06 8.2e+03 >>> 0.0e+00 35 34 74 73 0 41 36 96 96 0 11810 >>> MatSolve 13029 1.0 3.0851e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 33 31 0 0 0 39 33 0 0 0 11182 >>> MatLUFactorNum 128 1.0 1.2922e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 1 0 0 0 2 1 0 0 0 4358 >>> MatILUFactorSym 128 1.0 7.5075e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 1.3e+02 1 0 0 0 0 1 0 0 0 0 0 >>> MatGetRowIJ 128 1.0 1.4782e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 128 1.0 5.7567e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 2.6e+02 0 0 0 0 1 0 0 0 0 1 0 >>> KSPSetUp 256 1.0 1.9913e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 7.7e+02 0 0 0 0 2 0 0 0 0 2 0 >>> KSPSolve 128 1.0 7.6381e+01 1.0 1.65e+10 1.0 5.0e+06 8.2e+03 >>> 4.0e+04 84 95 74 73 97 99100 96 96100 13800 >>> PCSetUp 256 1.0 2.1503e+00 1.0 8.80e+07 1.0 0.0e+00 0.0e+00 >>> 6.4e+02 2 1 0 0 2 3 1 0 0 2 2619 >>> PCSetUpOnBlocks 128 1.0 2.1232e+00 1.0 8.80e+07 1.0 0.0e+00 
0.0e+00 >>> 3.8e+02 2 1 0 0 1 3 1 0 0 1 2652 >>> PCApply 13029 1.0 3.1812e+01 1.1 5.39e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 34 31 0 0 0 40 33 0 0 0 10844 >>> >>> --- Event Stage 4: IoStage >>> >>> VecView 10 1.0 1.7523e+00282.9 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 2.0e+01 1 0 0 0 0 36 0 0 0 40 0 >>> VecCopy 10 1.0 2.2449e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecScatterBegin 6 1.0 2.3620e-03 2.4 0.00e+00 0.0 2.3e+03 8.2e+03 >>> 0.0e+00 0 0 0 0 0 0 0 65 3 0 0 >>> VecScatterEnd 6 1.0 4.4194e-01663.9 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 9 0 0 0 0 0 >>> >>> --- Event Stage 5: SolvAlloc >>> >>> VecSet 50 1.0 1.3170e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 13 0 0 0 0 0 >>> MatAssemblyBegin 4 1.0 3.9801e-0230.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 8.0e+00 0 0 0 0 0 3 0 0 0 2 0 >>> MatAssemblyEnd 4 1.0 2.2752e-02 1.0 0.00e+00 0.0 1.5e+03 2.0e+03 >>> 1.6e+01 0 0 0 0 0 3 0 27 49 5 0 >>> >>> --- Event Stage 6: SolvSolve >>> >>> VecTDot 224 1.0 3.5454e-02 1.3 1.47e+07 1.0 0.0e+00 0.0e+00 >>> 2.2e+02 0 0 0 0 1 3 10 0 0 31 26499 >>> VecNorm 497 1.0 1.5268e-01 1.4 7.41e+06 1.0 0.0e+00 0.0e+00 >>> 5.0e+02 0 0 0 0 1 11 5 0 0 68 3104 >>> VecCopy 8 1.0 2.7523e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 114 1.0 5.9965e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAXPY 230 1.0 3.7198e-02 1.1 1.51e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 3 11 0 0 0 25934 >>> VecAYPX 111 1.0 1.7153e-02 1.1 7.27e+06 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 1 5 0 0 0 27142 >>> VecScatterBegin 116 1.0 1.1888e-02 1.2 0.00e+00 0.0 4.5e+04 8.2e+03 >>> 0.0e+00 0 0 1 1 0 1 0100100 0 0 >>> VecScatterEnd 116 1.0 2.8105e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 2 0 0 0 0 0 >>> MatMult 112 1.0 2.8080e-01 1.0 5.14e+07 1.0 4.3e+04 8.2e+03 >>> 0.0e+00 0 0 1 1 0 23 36 97 97 0 11711 >>> MatSolve 113 1.0 2.6673e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 22 33 0 0 0 11217 >>> MatLUFactorNum 1 1.0 1.0332e-02 1.0 6.87e+05 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 1 0 0 0 0 4259 >>> MatILUFactorSym 1 1.0 3.1291e-02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 1.0e+00 0 0 0 0 0 2 0 0 0 0 0 >>> MatGetRowIJ 1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 1 1.0 3.4251e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSetUp 2 1.0 3.6959e-0210.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 6.0e+00 0 0 0 0 0 1 0 0 0 1 0 >>> KSPSolve 1 1.0 6.9956e-01 1.0 1.43e+08 1.0 4.3e+04 8.2e+03 >>> 3.5e+02 1 1 1 1 1 58100 97 97 48 13069 >>> PCSetUp 2 1.0 4.4161e-02 2.3 6.87e+05 1.0 0.0e+00 0.0e+00 >>> 5.0e+00 0 0 0 0 0 3 0 0 0 1 996 >>> PCSetUpOnBlocks 1 1.0 4.3894e-02 2.4 6.87e+05 1.0 0.0e+00 0.0e+00 >>> 3.0e+00 0 0 0 0 0 3 0 0 0 0 1002 >>> PCApply 113 1.0 2.7507e-01 1.1 4.67e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 22 33 0 0 0 10877 >>> >>> --- Event Stage 7: SolvDeall >>> >>> >>> -------------------------------------------------------------------------- >>> ---------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' >>> Mem. >>> Reports information only for process 0. 
>>> >>> --- Event Stage 0: Main Stage >>> >>> Viewer 1 0 0 0 >>> >>> --- Event Stage 1: StepStage >>> >>> >>> --- Event Stage 2: ConvStage >>> >>> >>> --- Event Stage 3: ProjStage >>> >>> Vector 640 640 101604352 0 >>> Matrix 128 128 410327040 0 >>> >>> Index Set 384 384 17062912 0 >>> >>> Krylov Solver 256 256 282624 0 >>> >>> Preconditioner 256 256 228352 0 >>> >>> --- Event Stage 4: IoStage >>> >>> Vector 10 10 2636400 0 >>> Viewer 10 10 6880 0 >>> >>> --- Event Stage 5: SolvAlloc >>> >>> Vector 140 6 8848 0 >>> >>> Vector Scatter 6 0 0 0 >>> >>> Matrix 6 0 0 0 >>> >>> Distributed Mesh 2 0 0 0 >>> >>> Bipartite Graph 4 0 0 0 >>> >>> Index Set 14 14 372400 0 >>> >>> IS L to G Mapping 3 0 0 0 >>> >>> Krylov Solver 1 0 0 0 >>> >>> Preconditioner 1 0 0 0 >>> >>> --- Event Stage 6: SolvSolve >>> >>> Vector 5 0 0 0 >>> Matrix 1 0 0 0 >>> >>> Index Set 3 0 0 0 >>> >>> Krylov Solver 2 1 1136 0 >>> >>> Preconditioner 2 1 824 0 >>> >>> --- Event Stage 7: SolvDeall >>> >>> Vector 0 133 36676728 0 >>> >>> Vector Scatter 0 1 1036 0 >>> >>> Matrix 0 4 7038924 0 >>> >>> Index Set 0 3 133304 0 >>> >>> Krylov Solver 0 2 2208 0 >>> >>> Preconditioner 0 2 1784 0 >>> >>> ========================================================================== >>> ============================================== Average time to get >>> PetscTime(): 9.53674e-08 >>> Average time for MPI_Barrier(): 1.12057e-05 >>> Average time for zero size MPI_Send(): 1.3113e-06 >>> #PETSc Option Table entries: >>> -ksp_type cg >>> -log_summary >>> -pc_type bjacobi >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >>> Configure run at: >>> Configure options: >>> Application 9457215 resources: utime ~5920s, stime ~58s > From jed at jedbrown.org Mon Sep 29 14:59:49 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 29 Sep 2014 14:59:49 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <20769360.6VtOyZos7S@besikovitch-ii> References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii> Message-ID: <87iok68afe.fsf@jedbrown.org> Filippo Leonardi writes: > Thank you. > > Actually I had the feeling that it wasn't my problem with Bjacobi and CG. > > So I'll stick to MG. Problem with MG is that there are a lot of parameters to > be tuned, so I leave the defaults (expect I select CG as Krylow method). I > post just results for 64^3 and 128^3. Tell me if I'm missing some useful > detail. (I get similar results with BoomerAMG). > > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin -pc_type > mg): > 32^3 and 1 proc: 1.01e-1 > 64^3 and 8 proc: 6.56e-01 > 128^3 and 64 proc: 1.05e+00 > Number of PCSetup per KSPSolve: > 15 > 39 > 65 Presumably you mean PCApply. Something is wrong here because this iteration count is way too high. Perhaps your boundary conditions are nonsymmetric or interpolation is not compatible with the discretization. > With BoomerAMG: > stable 8 iterations per KSP but time per iteration greater than PETSc MG and > still increases: > 64^3: 3.17e+00 > 128^3: 9.99e+00 > > > --> For instance with 64^3 (256 iterations): In the first pass with geometric multigrid, don't worry about timing and get the iterations figured out. Are you using a cell-centered or vertex-centered discretization. When you say 128^3, is that counting the number of elements or the number of vertices? 
Note that if you have a vertex-centered discretization, you will want a 129^3 grid. With PCMG, make sure you are getting the number of levels of refinement that you expect. You should see something like the following (this is 193^3). $ mpiexec -n 4 ./ex45 -da_refine 5 -pc_type mg -ksp_monitor -pc_mg_type full -mg_levels_ksp_type richardson -mg_levels_pc_type sor -ksp_type richardson 0 KSP Residual norm 2.653722249919e+03 1 KSP Residual norm 1.019366121923e+02 2 KSP Residual norm 2.364558296616e-01 3 KSP Residual norm 7.438761746501e-04 Residual norm 1.47939e-06 You can actually do better than this by using higher order FMG interpolation, by going matrix-free, etc. For example, HPGMG (finite-element or finite-volume, see https://hpgmg.org) will solve more than a million equations/second per core. Is your application really solving the constant-coefficient Poisson problem on a Cartesian grid, or is that just a test? > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 And a reminder to please upgrade to the current version of PETSc. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From rlmackie862 at gmail.com Mon Sep 29 15:42:16 2014 From: rlmackie862 at gmail.com (Randall Mackie) Date: Mon, 29 Sep 2014 13:42:16 -0700 Subject: [petsc-users] include/finclude/petscsysdef.h and daimag Message-ID: <1FDD0BF6-7CB3-42AB-B928-1AECE701B705@gmail.com> I recently ran into an issue with include/finclude/petscsysdef.h and the definition of PetscImaginaryPart, which is defined as daimag(a) in the case PETSC_MISSING_DREAL is not defined. 1) As far as I know, daimag is not a valid fortran statement, and I suspect that here you might want dimag. 2) That being said, I was wondering why all my compiles using gfortran worked fine and didn't complain, whereas an Intel 2015 compilation did complain about daimag. Turns out Intel and gfortran have different behaviors for the dreal test in config/BuildSystem/config/types.py. Gfortran gives an *error* saying the argument in the call to dreal should be COMPLEX(8) and not REAL(4), whereas Intel just *warns* that the argument data type is incompatible with the intrinsic procedure. Since gfortran does not compile the dreal code test, it sets PETSC_MISSING_DREAL to 1, and uses aimag. But I'm curious just what exactly you are trying to test for? DREAL is a valid extension if the argument is double complex. So dreal(dcmplx(3.0,0.0)) will work just fine, but your test of dreal(3.0) is not a valid statement and should fail. Furthermore, I'm not sure I understand the need for the dreal stuff, since real, conjg, aimag all return values of the same kind as the argument for the case of complex variables. Thanks, Randy Mackie From filippo.leonardi at sam.math.ethz.ch Mon Sep 29 15:47:15 2014 From: filippo.leonardi at sam.math.ethz.ch (Filippo Leonardi) Date: Mon, 29 Sep 2014 22:47:15 +0200 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <87iok68afe.fsf@jedbrown.org> References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii> <87iok68afe.fsf@jedbrown.org> Message-ID: <3413675.Kj5UT1R2Xb@besikovitch-ii> @ Barry: It may be that I forgot to set the number of levels for the runs. 
New experiment with the following options: -da_refine 5 -pc_type mg -ksp_monitor -log_summary -pc_mg_type full -ksp_view - pc_mg_log -pc_mg_levels 5 -pc_mg_galerkin -ksp_monitor_true_residual - ksp_converged_reason on 128^3, and it looks nice: 0 KSP Residual norm 5.584601494955e+01 0 KSP preconditioned resid norm 5.584601494955e+01 true resid norm 1.370259979011e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 9.235021247277e+00 1 KSP preconditioned resid norm 9.235021247277e+00 true resid norm 8.185195475443e-01 ||r(i)||/||b|| 5.973461679404e-02 2 KSP Residual norm 6.344253555076e-02 2 KSP preconditioned resid norm 6.344253555076e-02 true resid norm 1.108015805992e-01 ||r(i)||/||b|| 8.086172134956e-03 3 KSP Residual norm 1.084530268454e-03 3 KSP preconditioned resid norm 1.084530268454e-03 true resid norm 3.228589340041e-03 ||r(i)||/||b|| 2.356187431214e-04 4 KSP Residual norm 2.345341850850e-05 4 KSP preconditioned resid norm 2.345341850850e-05 true resid norm 9.362117433445e-05 ||r(i)||/||b|| 6.832365811489e-06 Linear solve converged due to CONVERGED_RTOL iterations 4 I'll try on more processors. Is it correct what I am doing? If so, is there some tweak I am missing. Any suggestion on optimal number of levels VS number of processors? Btw, thanks a lot, you are always so helpful. On Monday 29 September 2014 14:59:49 you wrote: > Filippo Leonardi writes: > > Thank you. > > > > Actually I had the feeling that it wasn't my problem with Bjacobi and CG. > > > > So I'll stick to MG. Problem with MG is that there are a lot of parameters > > to be tuned, so I leave the defaults (expect I select CG as Krylow > > method). I post just results for 64^3 and 128^3. Tell me if I'm missing > > some useful detail. (I get similar results with BoomerAMG). > > > > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin > > -pc_type mg): > > 32^3 and 1 proc: 1.01e-1 > > 64^3 and 8 proc: 6.56e-01 > > 128^3 and 64 proc: 1.05e+00 > > Number of PCSetup per KSPSolve: > > 15 > > 39 > > 65 > > Presumably you mean PCApply. Something is wrong here because this > iteration count is way too high. Perhaps your boundary conditions are > nonsymmetric or interpolation is not compatible with the discretization. > > > With BoomerAMG: > > stable 8 iterations per KSP but time per iteration greater than PETSc MG > > and still increases: > > 64^3: 3.17e+00 > > 128^3: 9.99e+00 > > > --> For instance with 64^3 (256 iterations): > In the first pass with geometric multigrid, don't worry about timing and > get the iterations figured out. Are you using a cell-centered or > vertex-centered discretization. When you say 128^3, is that counting > the number of elements or the number of vertices? Note that if you have > a vertex-centered discretization, you will want a 129^3 grid. Cell-centered, counting elements. > With > PCMG, make sure you are getting the number of levels of refinement that > you expect. > > You should see something like the following (this is 193^3). > > $ mpiexec -n 4 ./ex45 -da_refine 5 -pc_type mg -ksp_monitor -pc_mg_type full > -mg_levels_ksp_type richardson -mg_levels_pc_type sor -ksp_type richardson > 0 KSP Residual norm 2.653722249919e+03 > 1 KSP Residual norm 1.019366121923e+02 > 2 KSP Residual norm 2.364558296616e-01 > 3 KSP Residual norm 7.438761746501e-04 > Residual norm 1.47939e-06 > > You can actually do better than this by using higher order FMG > interpolation, by going matrix-free, etc. 
For example, HPGMG > (finite-element or finite-volume, see https://hpgmg.org) will solve more > than a million equations/second per core. Is your application really > solving the constant-coefficient Poisson problem on a Cartesian grid, or > is that just a test? I actually just need a cell centered Poisson solver on cartesian grids. (various boundary conditions). Matrix free you mean AMG (like -pc_mg_galerkin)? Does it reach the same scalability as GMG? > > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT 2012 > > And a reminder to please upgrade to the current version of PETSc. Sadly this is not in my power. I had actually had to rollback all the APIs to be able to do this test runs. -------------- next part -------------- A non-text attachment was scrubbed... Name: ETHZ.vcf Type: text/vcard Size: 594 bytes Desc: not available URL: From knepley at gmail.com Mon Sep 29 15:51:31 2014 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 29 Sep 2014 15:51:31 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <3413675.Kj5UT1R2Xb@besikovitch-ii> References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii> <87iok68afe.fsf@jedbrown.org> <3413675.Kj5UT1R2Xb@besikovitch-ii> Message-ID: On Mon, Sep 29, 2014 at 3:47 PM, Filippo Leonardi < filippo.leonardi at sam.math.ethz.ch> wrote: > @ Barry: It may be that I forgot to set the number of levels for the runs. > > New experiment with the following options: > > -da_refine 5 -pc_type mg -ksp_monitor -log_summary -pc_mg_type full > -ksp_view - > pc_mg_log -pc_mg_levels 5 -pc_mg_galerkin -ksp_monitor_true_residual - > ksp_converged_reason > > on 128^3, and it looks nice: > > 0 KSP Residual norm 5.584601494955e+01 > 0 KSP preconditioned resid norm 5.584601494955e+01 true resid norm > 1.370259979011e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 9.235021247277e+00 > 1 KSP preconditioned resid norm 9.235021247277e+00 true resid norm > 8.185195475443e-01 ||r(i)||/||b|| 5.973461679404e-02 > 2 KSP Residual norm 6.344253555076e-02 > 2 KSP preconditioned resid norm 6.344253555076e-02 true resid norm > 1.108015805992e-01 ||r(i)||/||b|| 8.086172134956e-03 > 3 KSP Residual norm 1.084530268454e-03 > 3 KSP preconditioned resid norm 1.084530268454e-03 true resid norm > 3.228589340041e-03 ||r(i)||/||b|| 2.356187431214e-04 > 4 KSP Residual norm 2.345341850850e-05 > 4 KSP preconditioned resid norm 2.345341850850e-05 true resid norm > 9.362117433445e-05 ||r(i)||/||b|| 6.832365811489e-06 > Linear solve converged due to CONVERGED_RTOL iterations 4 > > I'll try on more processors. Is it correct what I am doing? If so, is there > some tweak I am missing. > > Any suggestion on optimal number of levels VS number of processors? > > Btw, thanks a lot, you are always so helpful. > > On Monday 29 September 2014 14:59:49 you wrote: > > Filippo Leonardi writes: > > > Thank you. > > > > > > Actually I had the feeling that it wasn't my problem with Bjacobi and > CG. > > > > > > So I'll stick to MG. Problem with MG is that there are a lot of > parameters > > > to be tuned, so I leave the defaults (expect I select CG as Krylow > > > method). I post just results for 64^3 and 128^3. Tell me if I'm missing > > > some useful detail. (I get similar results with BoomerAMG). 
> > > > > > Time for one KSP iteration (-ksp_type cg -log_summary -pc_mg_galerkin > > > -pc_type mg): > > > 32^3 and 1 proc: 1.01e-1 > > > 64^3 and 8 proc: 6.56e-01 > > > 128^3 and 64 proc: 1.05e+00 > > > Number of PCSetup per KSPSolve: > > > 15 > > > 39 > > > 65 > > > > Presumably you mean PCApply. Something is wrong here because this > > iteration count is way too high. Perhaps your boundary conditions are > > nonsymmetric or interpolation is not compatible with the discretization. > > > > > With BoomerAMG: > > > stable 8 iterations per KSP but time per iteration greater than PETSc > MG > > > and still increases: > > > 64^3: 3.17e+00 > > > 128^3: 9.99e+00 > > > > > --> For instance with 64^3 (256 iterations): > > In the first pass with geometric multigrid, don't worry about timing and > > get the iterations figured out. Are you using a cell-centered or > > vertex-centered discretization. When you say 128^3, is that counting > > the number of elements or the number of vertices? Note that if you have > > a vertex-centered discretization, you will want a 129^3 grid. > > Cell-centered, counting elements. > > > With > > PCMG, make sure you are getting the number of levels of refinement that > > you expect. > > > > > > > You should see something like the following (this is 193^3). > > > > $ mpiexec -n 4 ./ex45 -da_refine 5 -pc_type mg -ksp_monitor -pc_mg_type > full > > -mg_levels_ksp_type richardson -mg_levels_pc_type sor -ksp_type > richardson > > 0 KSP Residual norm 2.653722249919e+03 > > 1 KSP Residual norm 1.019366121923e+02 > > 2 KSP Residual norm 2.364558296616e-01 > > 3 KSP Residual norm 7.438761746501e-04 > > Residual norm 1.47939e-06 > > > > > You can actually do better than this by using higher order FMG > > interpolation, by going matrix-free, etc. For example, HPGMG > > (finite-element or finite-volume, see https://hpgmg.org) will solve more > > than a million equations/second per core. Is your application really > > solving the constant-coefficient Poisson problem on a Cartesian grid, or > > is that just a test? > > I actually just need a cell centered Poisson solver on cartesian grids. > (various boundary conditions). > > Matrix free you mean AMG (like -pc_mg_galerkin)? Does it reach the same > scalability as GMG? No, Jed means calculating the action of the operator matrix-free instead of using an assembled sparse matrix. > > > > > > Using Petsc Release Version 3.3.0, Patch 3, Wed Aug 29 11:26:24 CDT > 2012 > > > > And a reminder to please upgrade to the current version of PETSc. > > Sadly this is not in my power. I had actually had to rollback all the APIs > to > be able to do this test runs. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Sep 29 16:02:24 2014 From: jed at jedbrown.org (Jed Brown) Date: Mon, 29 Sep 2014 16:02:24 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: References: <2490546.DNVhllGaLT@besikovitch-ii> <20769360.6VtOyZos7S@besikovitch-ii> <87iok68afe.fsf@jedbrown.org> <3413675.Kj5UT1R2Xb@besikovitch-ii> Message-ID: <87zjdi6syn.fsf@jedbrown.org> Matthew Knepley writes: >> Matrix free you mean AMG (like -pc_mg_galerkin)? Does it reach the same >> scalability as GMG? 
> > > No, Jed means calculating the action of the operator matrix-free instead of > using > an assembled sparse matrix. Yeah, that was a diversion; don't worry about it for now. HPGMG is an example of an optimized parallel matrix-free geometric multigrid solver. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From salazardetroya at gmail.com Mon Sep 29 16:55:14 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Mon, 29 Sep 2014 16:55:14 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: Hi all I'm bumping this post because I have more questions related to the same problem. I am looping over the edges of my DMNetwork, then I obtain the vertices that make up each edge with DMNetworkGetConnectedNode(). Each of these vertices have two variables (or actually, two degrees of freedom for my problem). My intentions are to modify the solution vector entries that are affected by these variables in each vertex. I would call the function DMNetworkGetVariableOffset() to do this. What happens if one of the vertices is a ghost vertex? Can I still modify the solution vector? My problem is that the edge has information to provide to these nodes. Thanks Miguel On Fri, Sep 26, 2014 at 12:33 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > I understand. Thanks a lot. > > Miguel > > On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. < > abhyshr at mcs.anl.gov> wrote: > >> What Matt is saying is that there are two interfaces in PETSc for setting >> the residual evaluation routine: >> >> i) SNESSetFunction takes in a function pointer for the residual evaluation >> routine that has the prototype >> PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* >> ctx); >> >> X and F are the "global" solution and residual vectors. To compute the >> global residual evaluation, typically one does -- (a) scattering X and F >> onto local vectors localX and localF (DMGlobalToLocal), (b) computing the >> local residual, and (c) gathering the localF in the global F >> (DMLocalToGlobal). This is what is done in the example. >> >> ii) DMSNESSetFunctionLocal takes in a function pointer for the residual >> evaluation routine that has the prototype >> PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, >> void* ctx) >> >> In this case, the localX and localF get passed to the routine. So, you >> only have to do the local residual evaluation. PETSc does the >> LocalToGlobal gather to form the global residual. >> >> I chose to use SNESSetFunction in the example. You can use either of them. >> >> Shri >> >> From: Matthew Knepley >> Date: Fri, 26 Sep 2014 10:28:26 -0500 >> To: Miguel Angel Salazar de Troya >> Cc: Jed Brown , Shri , >> "petsc-users at mcs.anl.gov" >> Subject: Re: [petsc-users] DMPlex with spring elements >> >> >> >On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya >> > wrote: >> > >> >Yeah, but doesn't it only work with the local vectors localX and localF? >> > >> > >> > >> >I am telling you what the interface for the functions is. You can do >> >whatever you want inside. >> > >> > Matt >> > >> > >> >Miguel >> > >> >On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley >> >wrote: >> > >> >On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya >> > wrote: >> > >> >That means that if we call SNESSetFunction() we don't build the residual >> >vector in parallel? 
In the pflow example >> >( >> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tut >> >orials/network/pflow/pf.c.html) the function FormFunction() (Input for >> >SNESSetFunction() works with the local vectors. I don't understand this. >> > >> > >> > >> >FormFunction() in that link clearly takes in a global vector X and >> >returns a global vector F. Inside, it >> >converts them to local vectors. This is exactly what you would do for a >> >function given to SNESSetFunction(). >> > >> > Matt >> > >> > >> > >> >Thanks >> >Miguel >> > >> > >> >On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >> >wrote: >> > >> >On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya >> > wrote: >> > >> >Thanks. I had another question about the DM and SNES and TS. There are >> >similar routines to assign the residual and jacobian evaluation to both >> >objects. For the SNES case are: >> >DMSNESSetFunctionLocal >> >DMSNESSetJacobianLocal >> > >> >What are the differences of these with: >> > >> >SNESSetFunction >> >SNESSetJacobian >> > >> > >> > >> > >> >SNESSetFunction() expects the user to construct the entire parallel >> >residual vector. DMSNESSetFunctionLocal() >> >expects the user to construct the local pieces of the residual, and then >> >it automatically calls DMLocalToGlobal() >> >to assembly the full residual. It also converts the input from global >> >vectors to local vectors, and in the case of >> >DMDA multidimensional arrays. >> > >> > Thanks, >> > >> > Matt >> > >> > >> >and when should we use each? With "Local", it is meant to evaluate the >> >function/jacobian for the elements in the local processor? I could get >> >the local edges in DMNetwork by calling DMNetworkGetEdgeRange? >> > >> >Miguel >> > >> > >> >On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >> >wrote: >> > >> > >> > >> >On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya >> > wrote: >> > >> >> If you need a symmetric Jacobian, you can use the BC facility in >> >> PetscSection, which eliminates the >> >> variables completely. This is how the FEM examples, like ex12, work. >> >Would that be with PetscSectionSetConstraintDof ? For that I will need >> >the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >> >could cast it to DM_Network from the dm, networkdm, declared in the main >> >program, maybe something like this: >> >DM_Network *network = (DM_Network*) networkdm->data;Then I would loop >> >over the vertices and call PetscSectionSetConstraintDof if it's a >> >boundary node (by checking the corresponding component) >> > >> > >> > >> > >> >I admit to not completely understanding DMNetwork. However, it eventually >> >builds a PetscSection for data layout, which >> >you could get from DMGetDefaultSection(). The right thing to do is find >> >where it builds the Section, and put in your BC >> >there, but that sounds like it would entail coding. >> > >> > Thanks, >> > >> > Matt >> > >> > >> > >> > >> >Thanks for your responses.Miguel >> > >> > >> > >> > >> >On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >> > >> >Matthew Knepley writes: >> > >> >> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. >> >>> >>> wrote: >> >> >> >>> You are right. The Jacobian for the power grid application is indeed >> >>> non-symmetric. Is that a problem for your application? >> >>> >> >> >> >> If you need a symmetric Jacobian, you can use the BC facility in >> >> PetscSection, which eliminates the >> >> variables completely. This is how the FEM examples, like ex12, work. 
>> > >> >You can also use MatZeroRowsColumns() or do the equivalent >> >transformation during assembly (my preference). >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> > >> > >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> > >> > >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >What most experimenters take for granted before they begin their >> >experiments is infinitely more interesting than any results to which >> >their experiments lead. >> >-- Norbert Wiener >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >What most experimenters take for granted before they begin their >> >experiments is infinitely more interesting than any results to which >> >their experiments lead. >> >-- Norbert Wiener >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >What most experimenters take for granted before they begin their >> >experiments is infinitely more interesting than any results to which >> >their experiments lead. >> >-- Norbert Wiener >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >Miguel Angel Salazar de Troya >> >Graduate Research Assistant >> >Department of Mechanical Science and Engineering >> >University of Illinois at Urbana-Champaign >> >(217) 550-2360 >> >salaza11 at illinois.edu >> > >> > >> > >> > >> > >> > >> > >> > >> > >> >-- >> >What most experimenters take for granted before they begin their >> >experiments is infinitely more interesting than any results to which >> >their experiments lead. >> >-- Norbert Wiener >> >> > > > -- > *Miguel Angel Salazar de Troya* > > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- *Miguel Angel Salazar de Troya* Graduate Research Assistant Department of Mechanical Science and Engineering University of Illinois at Urbana-Champaign (217) 550-2360 salaza11 at illinois.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From filippo.leonardi at sam.math.ethz.ch Tue Sep 30 01:59:59 2014 From: filippo.leonardi at sam.math.ethz.ch (Filippo Leonardi) Date: Tue, 30 Sep 2014 08:59:59 +0200 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <87zjdi6syn.fsf@jedbrown.org> References: <2490546.DNVhllGaLT@besikovitch-ii> <87zjdi6syn.fsf@jedbrown.org> Message-ID: <2401575.mGBRNJJlEm@besikovitch-ii> Thank you everybody, number of iterations seems now to be under control, I'll run some scaling test and hope for the best. On Monday 29 September 2014 16:02:24 Jed Brown wrote: > Matthew Knepley writes: > >> Matrix free you mean AMG (like -pc_mg_galerkin)? Does it reach the same > >> scalability as GMG? > > > > No, Jed means calculating the action of the operator matrix-free instead > > of > > using > > an assembled sparse matrix. 
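To make "calculating the action of the operator matrix-free" concrete, here is a rough sketch of a shell matrix whose MatMult applies the operator on the fly instead of storing entries. The operator, sizes and names (UserMult, the 1-D stencil) are invented for illustration, and the KSPSetOperators call is the 3.5-style two-matrix form; for brevity the stencil couples only points inside each process's local block.

  #include <petscksp.h>

  /* Matrix-free operator: y = A*x applied on the fly, never assembled. */
  static PetscErrorCode UserMult(Mat A,Vec x,Vec y)
  {
    PetscErrorCode    ierr;
    PetscInt          i,n;
    const PetscScalar *xa;
    PetscScalar       *ya;

    ierr = VecGetLocalSize(x,&n);CHKERRQ(ierr);
    ierr = VecGetArrayRead(x,&xa);CHKERRQ(ierr);
    ierr = VecGetArray(y,&ya);CHKERRQ(ierr);
    for (i=0; i<n; i++) {                 /* 1-D Laplacian within the local block */
      ya[i] = 2.0*xa[i];
      if (i > 0)   ya[i] -= xa[i-1];
      if (i < n-1) ya[i] -= xa[i+1];
    }
    ierr = VecRestoreArrayRead(x,&xa);CHKERRQ(ierr);
    ierr = VecRestoreArray(y,&ya);CHKERRQ(ierr);
    return 0;
  }

  int main(int argc,char **argv)
  {
    PetscErrorCode ierr;
    Mat            A;
    Vec            x,b;
    KSP            ksp;
    PC             pc;
    PetscInt       n = 128;               /* local size per process */

    ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
    ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
    ierr = VecSetSizes(x,n,PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = VecSetFromOptions(x);CHKERRQ(ierr);
    ierr = VecDuplicate(x,&b);CHKERRQ(ierr);
    ierr = VecSet(b,1.0);CHKERRQ(ierr);

    ierr = MatCreateShell(PETSC_COMM_WORLD,n,n,PETSC_DETERMINE,PETSC_DETERMINE,NULL,&A);CHKERRQ(ierr);
    ierr = MatShellSetOperation(A,MATOP_MULT,(void (*)(void))UserMult);CHKERRQ(ierr);

    ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
    ierr = KSPSetType(ksp,KSPCG);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCNONE);CHKERRQ(ierr);  /* a shell matrix cannot be factored */
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = VecDestroy(&b);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }

The point of the sketch is only the MatCreateShell/MatShellSetOperation pattern; a production matrix-free geometric multigrid solver (HPGMG being the example cited) also supplies matrix-free smoothers and grid transfers.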
> > Yeah, that was a diversion; don't worry about it for now. HPGMG is an > example of an optimized parallel matrix-free geometric multigrid solver. Now I am intrigued, is there, by any chance, any reference I can look up for this? -------------- next part -------------- A non-text attachment was scrubbed... Name: ETHZ.vcf Type: text/vcard Size: 594 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Sep 30 02:13:55 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Sep 2014 02:13:55 -0500 Subject: [petsc-users] Scaling/Preconditioners for Poisson equation In-Reply-To: <2401575.mGBRNJJlEm@besikovitch-ii> References: <2490546.DNVhllGaLT@besikovitch-ii> <87zjdi6syn.fsf@jedbrown.org> <2401575.mGBRNJJlEm@besikovitch-ii> Message-ID: http://lmgtfy.com/?q=HPGMG On Sep 30, 2014, at 1:59 AM, Filippo Leonardi wrote: > Thank you everybody, number of iterations seems now to be under control, I'll > run some scaling test and hope for the best. > > Now I am intrigued, is there, by any chance, any reference I can look up for > this? > From csp at info.szwgroup.com Tue Sep 30 02:19:28 2014 From: csp at info.szwgroup.com (=?utf-8?B?TXMuIEVsbGEgV2Vp?=) Date: Tue, 30 Sep 2014 15:19:28 +0800 (CST) Subject: [petsc-users] =?utf-8?q?Spain_V=2ES=2E_USA--Developing_CSP_projec?= =?utf-8?q?ts_in_South_Africa?= Message-ID: <20140930071928.B833730B6DCD@mx6.easemaillist.com> An HTML attachment was scrubbed... URL: From vbaros at hsr.ch Tue Sep 30 05:03:55 2014 From: vbaros at hsr.ch (Baros Vladimir) Date: Tue, 30 Sep 2014 10:03:55 +0000 Subject: [petsc-users] MKL Pardiso with MPI Message-ID: <293c496a16574aa0a9dff75fde6cdaf4@sid00230.hsr.ch> Hi, Intel released MKL 11.2 with support for Pardiso cluster version. https://software.intel.com/en-us/articles/intel-math-kernel-library-parallel-direct-sparse-solver-for-clusters Is it possible to use Pardiso with PETSC using MPI? I'm on Windows if that matters. Currently if I try to run, I will get this error: mpiexec Solver2D.exe -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mkl_pardiso [0]PETSC ERROR: --------------------- Error Message ---------------------------- ---------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Matrix format mpiaij does not have a solver package mkl_pardiso for LU. Perhaps you must ./configure with --download-mkl_pardiso [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trou ble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014 Regards, Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Tue Sep 30 05:14:15 2014 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 30 Sep 2014 12:14:15 +0200 Subject: [petsc-users] MKL Pardiso with MPI In-Reply-To: <293c496a16574aa0a9dff75fde6cdaf4@sid00230.hsr.ch> References: <293c496a16574aa0a9dff75fde6cdaf4@sid00230.hsr.ch> Message-ID: <542A8276.3060107@gmail.com> You should include make.log and configure.log (to answer the question of whether you configured with --download-mkl_pardiso, amongst others) . On 9/30/14 12:03 PM, Baros Vladimir wrote: > > Hi, > > Intel released MKL 11.2 with support for Pardiso cluster version. > > https://software.intel.com/en-us/articles/intel-math-kernel-library-parallel-direct-sparse-solver-for-clusters > > Is it possible to use Pardiso with PETSC using MPI? > > I?m on Windows if that matters. 
> > Currently if I try to run, I will get this error: > > mpiexec Solver2D.exe -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package mkl_pardiso > > [0]PETSC ERROR: --------------------- Error Message > ---------------------------- > > ---------------------------------- > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: Matrix format mpiaij does not have a solver package > mkl_pardiso > > for LU. Perhaps you must ./configure with --download-mkl_pardiso > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trou > > ble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014 > > Regards, > > Vladimir > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hillsmattc at outlook.com Tue Sep 30 06:49:21 2014 From: hillsmattc at outlook.com (Matthew Hills) Date: Tue, 30 Sep 2014 13:49:21 +0200 Subject: [petsc-users] PETSc and MPE Message-ID: Hi PETSc team, I'm attempting to analyze and optimize a structural analysis program called SESKA using mpe. I have configured PETSc with: ./configure --with-mpi=1 --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-f-blas-lapack=${PETSC_DIR}/externalpackages/fblaslapacklinpack-3.1.1.tar.gz --with-parmetis=1 --download-parmetis=${PETSC_DIR}/externalpackages/parmetis-4.0.2-p3.tar.gz --with-mumps=1 --download-mumps=${PETSC_DIR}/externalpackages/MUMPS_4.10.0-p3.tar.gz --with-scalapack=1 --download-scalapack=${PETSC_DIR}/externalpackages/scalapack-2.0.2.tgz --download-blacs=${PETSC_DIR}/externalpackages/blacs-dev.tar.gz --with-superlu_dist=1 --download-superlu_dist=${PETSC_DIR}/externalpackages/superlu_dist_3.1.tar.gz --with-metis --download-metis=${PETSC_DIR}/externalpackages/metis-5.0.2-p3.tar.gz --with-sowing=${PETSC_DIR}/externalpackages/sowing-1.1.16d.tar.gz --with-c2html=0 --with-shared-libraries=1 --download-mpich-mpe=1 --download-mpich=${SESKADIR}/packages/downloads/mpich-3.0.4.tar.gz and then run SESKA with: mpiexec -n 8 seska -log_mpe mpe.log However mpe.log is not created. I wish to view this file in Jumpsuit. Any assistance would be greatly appreciated. Regards,Matthew -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Sep 30 07:22:04 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Sep 2014 07:22:04 -0500 Subject: [petsc-users] PETSc and MPE In-Reply-To: References: Message-ID: Send configure.log and make.log On Sep 30, 2014, at 6:49 AM, Matthew Hills wrote: > Hi PETSc team, > > I'm attempting to analyze and optimize a structural analysis program called SESKA using mpe. 
I have configured PETSc with: > > ./configure --with-mpi=1 --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-f-blas-lapack=${PETSC_DIR}/externalpackages/fblaslapacklinpack-3.1.1.tar.gz --with-parmetis=1 --download-parmetis=${PETSC_DIR}/externalpackages/parmetis-4.0.2-p3.tar.gz --with-mumps=1 --download-mumps=${PETSC_DIR}/externalpackages/MUMPS_4.10.0-p3.tar.gz --with-scalapack=1 --download-scalapack=${PETSC_DIR}/externalpackages/scalapack-2.0.2.tgz --download-blacs=${PETSC_DIR}/externalpackages/blacs-dev.tar.gz --with-superlu_dist=1 --download-superlu_dist=${PETSC_DIR}/externalpackages/superlu_dist_3.1.tar.gz --with-metis --download-metis=${PETSC_DIR}/externalpackages/metis-5.0.2-p3.tar.gz --with-sowing=${PETSC_DIR}/externalpackages/sowing-1.1.16d.tar.gz --with-c2html=0 --with-shared-libraries=1 --download-mpich-mpe=1 --download-mpich=${SESKADIR}/packages/downloads/mpich-3.0.4.tar.gz > > > and then run SESKA with: > > mpiexec -n 8 seska -log_mpe mpe.log > > > However mpe.log is not created. I wish to view this file in Jumpsuit. Any assistance would be greatly appreciated. > > Regards, > Matthew From balay at mcs.anl.gov Tue Sep 30 08:06:26 2014 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 30 Sep 2014 08:06:26 -0500 Subject: [petsc-users] PETSc and MPE In-Reply-To: References: Message-ID: This hasn't been tested in a while - but I think you have to configur with: --download-mpe And then run the code with: -log_mpe Satish On Tue, 30 Sep 2014, Matthew Hills wrote: > Hi PETSc team, > > I'm attempting to analyze and optimize a structural analysis program called SESKA using mpe. I have configured PETSc with: > > > > > > > > > > > ./configure > --with-mpi=1 --with-cc=gcc > --with-cxx=g++ > --with-fc=gfortran > --download-f-blas-lapack=${PETSC_DIR}/externalpackages/fblaslapacklinpack-3.1.1.tar.gz > --with-parmetis=1 > --download-parmetis=${PETSC_DIR}/externalpackages/parmetis-4.0.2-p3.tar.gz > --with-mumps=1 > --download-mumps=${PETSC_DIR}/externalpackages/MUMPS_4.10.0-p3.tar.gz > --with-scalapack=1 > --download-scalapack=${PETSC_DIR}/externalpackages/scalapack-2.0.2.tgz > --download-blacs=${PETSC_DIR}/externalpackages/blacs-dev.tar.gz > --with-superlu_dist=1 > --download-superlu_dist=${PETSC_DIR}/externalpackages/superlu_dist_3.1.tar.gz > --with-metis > --download-metis=${PETSC_DIR}/externalpackages/metis-5.0.2-p3.tar.gz > --with-sowing=${PETSC_DIR}/externalpackages/sowing-1.1.16d.tar.gz > --with-c2html=0 --with-shared-libraries=1 --download-mpich-mpe=1 > --download-mpich=${SESKADIR}/packages/downloads/mpich-3.0.4.tar.gz > > and then run SESKA with: > mpiexec -n 8 seska -log_mpe mpe.log > > However mpe.log is not created. I wish to view this file in Jumpsuit. Any assistance would be greatly appreciated. > Regards,Matthew > > > > > From knepley at gmail.com Tue Sep 30 11:17:34 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Sep 2014 11:17:34 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Mon, Sep 29, 2014 at 4:55 PM, Miguel Angel Salazar de Troya < salazardetroya at gmail.com> wrote: > Hi all > > I'm bumping this post because I have more questions related to the same > problem. > > I am looping over the edges of my DMNetwork, then I obtain the vertices > that make up each edge with DMNetworkGetConnectedNode(). Each of these > vertices have two variables (or actually, two degrees of freedom for my > problem). 
My intentions are to modify the solution vector entries that are > affected by these variables in each vertex. I would call the function > DMNetworkGetVariableOffset() to do this. What happens if one of the > vertices is a ghost vertex? Can I still modify the solution vector? My > problem is that the edge has information to provide to these nodes. > This is unfortunately not explained well enough in our documentation. I will try to put a more substantial explanation in the manual. PETSc operates in two different spaces: local space and global space. In the global space, every degree of freedom (dof, or unknown) is owned by a single process and is a single entry which lives on that process in a global Vec. This is what solvers, like GMRES, deal with and also what is plotted for visualization. In the local space, each process has all of the dofs it owns AND all of those that it shares with other owning processes (ghosts). Thus in the local space a given dof can be entries on several processes in a local Vec. You usually use the local space to compute residuals/Jacobians. Our paradigm is: compute residual in the local space, so that you can set values for ghosts, and then call DMLocalToGlobal() to put those values into a global Vec (using some combination operator, like +) which can be handed to the solver. Thanks, Matt > Thanks > Miguel > > On Fri, Sep 26, 2014 at 12:33 PM, Miguel Angel Salazar de Troya < > salazardetroya at gmail.com> wrote: > >> I understand. Thanks a lot. >> >> Miguel >> >> On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. < >> abhyshr at mcs.anl.gov> wrote: >> >>> What Matt is saying is that there are two interfaces in PETSc for setting >>> the residual evaluation routine: >>> >>> i) SNESSetFunction takes in a function pointer for the residual >>> evaluation >>> routine that has the prototype >>> PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* >>> ctx); >>> >>> X and F are the "global" solution and residual vectors. To compute the >>> global residual evaluation, typically one does -- (a) scattering X and F >>> onto local vectors localX and localF (DMGlobalToLocal), (b) computing the >>> local residual, and (c) gathering the localF in the global F >>> (DMLocalToGlobal). This is what is done in the example. >>> >>> ii) DMSNESSetFunctionLocal takes in a function pointer for the residual >>> evaluation routine that has the prototype >>> PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, >>> void* ctx) >>> >>> In this case, the localX and localF get passed to the routine. So, you >>> only have to do the local residual evaluation. PETSc does the >>> LocalToGlobal gather to form the global residual. >>> >>> I chose to use SNESSetFunction in the example. You can use either of >>> them. >>> >>> Shri >>> >>> From: Matthew Knepley >>> Date: Fri, 26 Sep 2014 10:28:26 -0500 >>> To: Miguel Angel Salazar de Troya >>> Cc: Jed Brown , Shri , >>> "petsc-users at mcs.anl.gov" >>> Subject: Re: [petsc-users] DMPlex with spring elements >>> >>> >>> >On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya >>> > wrote: >>> > >>> >Yeah, but doesn't it only work with the local vectors localX and localF? >>> > >>> > >>> > >>> >I am telling you what the interface for the functions is. You can do >>> >whatever you want inside. 
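To make the local/global paradigm described above concrete, here is a bare-bones sketch of a residual routine registered with SNESSetFunction(), following the scatter/compute/gather steps listed earlier in the thread. The physics is left as a comment; only the vector movement is shown.

  #include <petscsnes.h>

  /* Skeleton residual for SNESSetFunction(): (a) global-to-local,
     (b) local compute, (c) local-to-global. */
  static PetscErrorCode FormFunction(SNES snes,Vec X,Vec F,void *ctx)
  {
    PetscErrorCode ierr;
    DM             dm;
    Vec            localX,localF;

    ierr = SNESGetDM(snes,&dm);CHKERRQ(ierr);
    ierr = DMGetLocalVector(dm,&localX);CHKERRQ(ierr);
    ierr = DMGetLocalVector(dm,&localF);CHKERRQ(ierr);

    /* (a) scatter the global X into localX, filling ghost entries */
    ierr = DMGlobalToLocalBegin(dm,X,INSERT_VALUES,localX);CHKERRQ(ierr);
    ierr = DMGlobalToLocalEnd(dm,X,INSERT_VALUES,localX);CHKERRQ(ierr);
    ierr = VecSet(localF,0.0);CHKERRQ(ierr);

    /* (b) local residual evaluation: loop over local (and ghost) points,
       adding each edge's/element's contribution into localF.
       ... problem-specific physics goes here ... */

    /* (c) gather: ghost contributions are summed onto the owning rank */
    ierr = VecSet(F,0.0);CHKERRQ(ierr);
    ierr = DMLocalToGlobalBegin(dm,localF,ADD_VALUES,F);CHKERRQ(ierr);
    ierr = DMLocalToGlobalEnd(dm,localF,ADD_VALUES,F);CHKERRQ(ierr);

    ierr = DMRestoreLocalVector(dm,&localX);CHKERRQ(ierr);
    ierr = DMRestoreLocalVector(dm,&localF);CHKERRQ(ierr);
    return 0;
  }

Writing into localF at a ghost vertex is exactly what the ADD_VALUES gather in step (c) is for: the contribution ends up in the single global entry owned by some process.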
>>> > >>> > Matt >>> > >>> > >>> >Miguel >>> > >>> >On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley >>> >wrote: >>> > >>> >On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya >>> > wrote: >>> > >>> >That means that if we call SNESSetFunction() we don't build the residual >>> >vector in parallel? In the pflow example >>> >( >>> http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tut >>> >orials/network/pflow/pf.c.html) the function FormFunction() (Input for >>> >SNESSetFunction() works with the local vectors. I don't understand this. >>> > >>> > >>> > >>> >FormFunction() in that link clearly takes in a global vector X and >>> >returns a global vector F. Inside, it >>> >converts them to local vectors. This is exactly what you would do for a >>> >function given to SNESSetFunction(). >>> > >>> > Matt >>> > >>> > >>> > >>> >Thanks >>> >Miguel >>> > >>> > >>> >On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >>> >wrote: >>> > >>> >On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya >>> > wrote: >>> > >>> >Thanks. I had another question about the DM and SNES and TS. There are >>> >similar routines to assign the residual and jacobian evaluation to both >>> >objects. For the SNES case are: >>> >DMSNESSetFunctionLocal >>> >DMSNESSetJacobianLocal >>> > >>> >What are the differences of these with: >>> > >>> >SNESSetFunction >>> >SNESSetJacobian >>> > >>> > >>> > >>> > >>> >SNESSetFunction() expects the user to construct the entire parallel >>> >residual vector. DMSNESSetFunctionLocal() >>> >expects the user to construct the local pieces of the residual, and then >>> >it automatically calls DMLocalToGlobal() >>> >to assembly the full residual. It also converts the input from global >>> >vectors to local vectors, and in the case of >>> >DMDA multidimensional arrays. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > >>> >and when should we use each? With "Local", it is meant to evaluate the >>> >function/jacobian for the elements in the local processor? I could get >>> >the local edges in DMNetwork by calling DMNetworkGetEdgeRange? >>> > >>> >Miguel >>> > >>> > >>> >On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>> >wrote: >>> > >>> > >>> > >>> >On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya >>> > wrote: >>> > >>> >> If you need a symmetric Jacobian, you can use the BC facility in >>> >> PetscSection, which eliminates the >>> >> variables completely. This is how the FEM examples, like ex12, work. >>> >Would that be with PetscSectionSetConstraintDof ? For that I will need >>> >the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>> >could cast it to DM_Network from the dm, networkdm, declared in the >>> main >>> >program, maybe something like this: >>> >DM_Network *network = (DM_Network*) networkdm->data;Then I would >>> loop >>> >over the vertices and call PetscSectionSetConstraintDof if it's a >>> >boundary node (by checking the corresponding component) >>> > >>> > >>> > >>> > >>> >I admit to not completely understanding DMNetwork. However, it >>> eventually >>> >builds a PetscSection for data layout, which >>> >you could get from DMGetDefaultSection(). The right thing to do is find >>> >where it builds the Section, and put in your BC >>> >there, but that sounds like it would entail coding. 
>>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > >>> > >>> > >>> >Thanks for your responses.Miguel >>> > >>> > >>> > >>> > >>> >On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >>> > >>> >Matthew Knepley writes: >>> > >>> >> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. >>> >>>> >>> wrote: >>> >> >>> >>> You are right. The Jacobian for the power grid application is indeed >>> >>> non-symmetric. Is that a problem for your application? >>> >>> >>> >> >>> >> If you need a symmetric Jacobian, you can use the BC facility in >>> >> PetscSection, which eliminates the >>> >> variables completely. This is how the FEM examples, like ex12, work. >>> > >>> >You can also use MatZeroRowsColumns() or do the equivalent >>> >transformation during assembly (my preference). >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> > >>> > >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> > >>> > >>> >salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >What most experimenters take for granted before they begin their >>> >experiments is infinitely more interesting than any results to which >>> >their experiments lead. >>> >-- Norbert Wiener >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> >salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >What most experimenters take for granted before they begin their >>> >experiments is infinitely more interesting than any results to which >>> >their experiments lead. >>> >-- Norbert Wiener >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> >salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >What most experimenters take for granted before they begin their >>> >experiments is infinitely more interesting than any results to which >>> >their experiments lead. >>> >-- Norbert Wiener >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >Miguel Angel Salazar de Troya >>> >Graduate Research Assistant >>> >Department of Mechanical Science and Engineering >>> >University of Illinois at Urbana-Champaign >>> >(217) 550-2360 >>> >salaza11 at illinois.edu >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >-- >>> >What most experimenters take for granted before they begin their >>> >experiments is infinitely more interesting than any results to which >>> >their experiments lead. 
>>> >-- Norbert Wiener >>> >>> >> >> >> -- >> *Miguel Angel Salazar de Troya* >> >> Graduate Research Assistant >> Department of Mechanical Science and Engineering >> University of Illinois at Urbana-Champaign >> (217) 550-2360 >> salaza11 at illinois.edu >> >> > > > -- > *Miguel Angel Salazar de Troya* > Graduate Research Assistant > Department of Mechanical Science and Engineering > University of Illinois at Urbana-Champaign > (217) 550-2360 > salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at mcs.anl.gov Tue Sep 30 11:22:43 2014 From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.) Date: Tue, 30 Sep 2014 16:22:43 +0000 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: Message-ID: From: Miguel Angel Salazar de Troya Date: Mon, 29 Sep 2014 16:55:14 -0500 To: Shri Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] DMPlex with spring elements >Hi all >I'm bumping this post because I have more questions related to the same >problem. > >I am looping over the edges of my DMNetwork, then I obtain the vertices >that make up each edge with DMNetworkGetConnectedNode(). Each of these >vertices have two variables (or actually, two degrees of freedom for my >problem). My intentions are to modify the solution vector entries that >are affected by these variables in each vertex. I would call the function >DMNetworkGetVariableOffset() to do this. What happens if one of the >vertices is a ghost vertex? Can I still modify the solution vector? My >problem is that the edge has information to provide to these nodes. > > Sorry for the delay. I think you would want to use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMNetworkIsG hostVertex.html and not modify the value for the ghost vertex. Shri > >Thanks >Miguel > > > >On Fri, Sep 26, 2014 at 12:33 PM, Miguel Angel Salazar de Troya > wrote: > >I understand. Thanks a lot. >Miguel > > >On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. > wrote: > >What Matt is saying is that there are two interfaces in PETSc for setting >the residual evaluation routine: > >i) SNESSetFunction takes in a function pointer for the residual evaluation >routine that has the prototype > PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* >ctx); > >X and F are the "global" solution and residual vectors. To compute the >global residual evaluation, typically one does -- (a) scattering X and F >onto local vectors localX and localF (DMGlobalToLocal), (b) computing the >local residual, and (c) gathering the localF in the global F >(DMLocalToGlobal). This is what is done in the example. > >ii) DMSNESSetFunctionLocal takes in a function pointer for the residual >evaluation routine that has the prototype > PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, >void* ctx) > >In this case, the localX and localF get passed to the routine. So, you >only have to do the local residual evaluation. PETSc does the >LocalToGlobal gather to form the global residual. > >I chose to use SNESSetFunction in the example. You can use either of them. 
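For comparison with the SNESSetFunction form, a sketch of the local-form counterpart just described (interface ii). Only step (b) of the recipe remains, since PETSc fills the ghosts of localX and performs the LocalToGlobal gather of localF itself; the registration lines at the bottom assume snes and dm exist already.

  #include <petscsnes.h>

  /* Local form for DMSNESSetFunctionLocal(): localX arrives with ghost
     entries already filled; PETSc gathers localF into the global residual. */
  static PetscErrorCode FormFunctionLocal(DM dm,Vec localX,Vec localF,void *ctx)
  {
    PetscErrorCode ierr;

    ierr = VecSet(localF,0.0);CHKERRQ(ierr);
    /* ... loop over local edges/vertices/cells and add contributions ... */
    return 0;
  }

  /* Registration (snes and dm assumed to exist already):
       ierr = SNESSetDM(snes,dm);CHKERRQ(ierr);
       ierr = DMSNESSetFunctionLocal(dm,FormFunctionLocal,NULL);CHKERRQ(ierr);
  */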
> >Shri > >From: Matthew Knepley >Date: Fri, 26 Sep 2014 10:28:26 -0500 >To: Miguel Angel Salazar de Troya >Cc: Jed Brown , Shri , >"petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] DMPlex with spring elements > > >>On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya >> wrote: >> >>Yeah, but doesn't it only work with the local vectors localX and localF? >> >> >> >>I am telling you what the interface for the functions is. You can do >>whatever you want inside. >> >> Matt >> >> >>Miguel >> >>On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley >>wrote: >> >>On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya >> wrote: >> >>That means that if we call SNESSetFunction() we don't build the residual >>vector in parallel? In the pflow example >>(http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tu >>t >>orials/network/pflow/pf.c.html) the function FormFunction() (Input for >>SNESSetFunction() works with the local vectors. I don't understand this. >> >> >> >>FormFunction() in that link clearly takes in a global vector X and >>returns a global vector F. Inside, it >>converts them to local vectors. This is exactly what you would do for a >>function given to SNESSetFunction(). >> >> Matt >> >> >> >>Thanks >>Miguel >> >> >>On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley >>wrote: >> >>On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya >> wrote: >> >>Thanks. I had another question about the DM and SNES and TS. There are >>similar routines to assign the residual and jacobian evaluation to both >>objects. For the SNES case are: >>DMSNESSetFunctionLocal >>DMSNESSetJacobianLocal >> >>What are the differences of these with: >> >>SNESSetFunction >>SNESSetJacobian >> >> >> >> >>SNESSetFunction() expects the user to construct the entire parallel >>residual vector. DMSNESSetFunctionLocal() >>expects the user to construct the local pieces of the residual, and then >>it automatically calls DMLocalToGlobal() >>to assembly the full residual. It also converts the input from global >>vectors to local vectors, and in the case of >>DMDA multidimensional arrays. >> >> Thanks, >> >> Matt >> >> >>and when should we use each? With "Local", it is meant to evaluate the >>function/jacobian for the elements in the local processor? I could get >>the local edges in DMNetwork by calling DMNetworkGetEdgeRange? >> >>Miguel >> >> >>On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley >>wrote: >> >> >> >>On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya >> wrote: >> >>> If you need a symmetric Jacobian, you can use the BC facility in >>> PetscSection, which eliminates the >>> variables completely. This is how the FEM examples, like ex12, work. >>Would that be with PetscSectionSetConstraintDof ? For that I will need >>the PetscSection, DofSection, within DMNetwork, how can I obtain it? I >>could cast it to DM_Network from the dm, networkdm, declared in the main >>program, maybe something like this: >>DM_Network *network = (DM_Network*) networkdm->data;Then I would loop >>over the vertices and call PetscSectionSetConstraintDof if it's a >>boundary node (by checking the corresponding component) >> >> >> >> >>I admit to not completely understanding DMNetwork. However, it eventually >>builds a PetscSection for data layout, which >>you could get from DMGetDefaultSection(). The right thing to do is find >>where it builds the Section, and put in your BC >>there, but that sounds like it would entail coding. 
>> >> Thanks, >> >> Matt >> >> >> >> >>Thanks for your responses.Miguel >> >> >> >> >>On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: >> >>Matthew Knepley writes: >> >>> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. >>>>>> wrote: >>> >>>> You are right. The Jacobian for the power grid application is indeed >>>> non-symmetric. Is that a problem for your application? >>>> >>> >>> If you need a symmetric Jacobian, you can use the BC facility in >>> PetscSection, which eliminates the >>> variables completely. This is how the FEM examples, like ex12, work. >> >>You can also use MatZeroRowsColumns() or do the equivalent >>transformation during assembly (my preference). >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >> >> >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign > > >>(217) 550-2360 >> >> >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener >> >> >> >> >> >> >> >> >> >> >>-- >>Miguel Angel Salazar de Troya >>Graduate Research Assistant >>Department of Mechanical Science and Engineering >>University of Illinois at Urbana-Champaign >>(217) 550-2360 >>salaza11 at illinois.edu >> >> >> >> >> >> >> >> >> >>-- >>What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >>-- Norbert Wiener > > > > > > > > > >-- > > >Miguel Angel Salazar de Troya > > >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 > > >salaza11 at illinois.edu > > > > > > > > >-- >Miguel Angel Salazar de Troya >Graduate Research Assistant >Department of Mechanical Science and Engineering >University of Illinois at Urbana-Champaign >(217) 550-2360 >salaza11 at illinois.edu From knepley at gmail.com Tue Sep 30 11:24:23 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Sep 2014 11:24:23 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: On Tue, Sep 30, 2014 at 11:22 AM, Abhyankar, Shrirang G. < abhyshr at mcs.anl.gov> wrote: > > From: Miguel Angel Salazar de Troya > Date: Mon, 29 Sep 2014 16:55:14 -0500 > To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > > >Hi all > >I'm bumping this post because I have more questions related to the same > >problem. 
> > > >I am looping over the edges of my DMNetwork, then I obtain the vertices > >that make up each edge with DMNetworkGetConnectedNode(). Each of these > >vertices have two variables (or actually, two degrees of freedom for my > >problem). My intentions are to modify the solution vector entries that > >are affected by these variables in each vertex. I would call the function > >DMNetworkGetVariableOffset() to do this. What happens if one of the > >vertices is a ghost vertex? Can I still modify the solution vector? My > >problem is that the edge has information to provide to these nodes. > > > > > > Sorry for the delay. I think you would want to use > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMNetworkIsG > hostVertex.html and not modify the value for the ghost vertex. > This depends on the discretization. In FD, you do not modify ghosts, but in FEM you do. Matt > Shri > > > > >Thanks > >Miguel > > > > > > > >On Fri, Sep 26, 2014 at 12:33 PM, Miguel Angel Salazar de Troya > > wrote: > > > >I understand. Thanks a lot. > >Miguel > > > > > >On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. > > wrote: > > > >What Matt is saying is that there are two interfaces in PETSc for setting > >the residual evaluation routine: > > > >i) SNESSetFunction takes in a function pointer for the residual evaluation > >routine that has the prototype > > PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* > >ctx); > > > >X and F are the "global" solution and residual vectors. To compute the > >global residual evaluation, typically one does -- (a) scattering X and F > >onto local vectors localX and localF (DMGlobalToLocal), (b) computing the > >local residual, and (c) gathering the localF in the global F > >(DMLocalToGlobal). This is what is done in the example. > > > >ii) DMSNESSetFunctionLocal takes in a function pointer for the residual > >evaluation routine that has the prototype > > PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, > >void* ctx) > > > >In this case, the localX and localF get passed to the routine. So, you > >only have to do the local residual evaluation. PETSc does the > >LocalToGlobal gather to form the global residual. > > > >I chose to use SNESSetFunction in the example. You can use either of them. > > > >Shri > > > >From: Matthew Knepley > >Date: Fri, 26 Sep 2014 10:28:26 -0500 > >To: Miguel Angel Salazar de Troya > >Cc: Jed Brown , Shri , > >"petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >>On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>Yeah, but doesn't it only work with the local vectors localX and localF? > >> > >> > >> > >>I am telling you what the interface for the functions is. You can do > >>whatever you want inside. > >> > >> Matt > >> > >> > >>Miguel > >> > >>On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley > >>wrote: > >> > >>On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>That means that if we call SNESSetFunction() we don't build the residual > >>vector in parallel? In the pflow example > >>( > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tu > >>t > >>orials/network/pflow/pf.c.html) the function FormFunction() (Input for > >>SNESSetFunction() works with the local vectors. I don't understand this. > >> > >> > >> > >>FormFunction() in that link clearly takes in a global vector X and > >>returns a global vector F. Inside, it > >>converts them to local vectors. 
This is exactly what you would do for a > >>function given to SNESSetFunction(). > >> > >> Matt > >> > >> > >> > >>Thanks > >>Miguel > >> > >> > >>On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley > >>wrote: > >> > >>On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>Thanks. I had another question about the DM and SNES and TS. There are > >>similar routines to assign the residual and jacobian evaluation to both > >>objects. For the SNES case are: > >>DMSNESSetFunctionLocal > >>DMSNESSetJacobianLocal > >> > >>What are the differences of these with: > >> > >>SNESSetFunction > >>SNESSetJacobian > >> > >> > >> > >> > >>SNESSetFunction() expects the user to construct the entire parallel > >>residual vector. DMSNESSetFunctionLocal() > >>expects the user to construct the local pieces of the residual, and then > >>it automatically calls DMLocalToGlobal() > >>to assembly the full residual. It also converts the input from global > >>vectors to local vectors, and in the case of > >>DMDA multidimensional arrays. > >> > >> Thanks, > >> > >> Matt > >> > >> > >>and when should we use each? With "Local", it is meant to evaluate the > >>function/jacobian for the elements in the local processor? I could get > >>the local edges in DMNetwork by calling DMNetworkGetEdgeRange? > >> > >>Miguel > >> > >> > >>On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley > >>wrote: > >> > >> > >> > >>On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>> If you need a symmetric Jacobian, you can use the BC facility in > >>> PetscSection, which eliminates the > >>> variables completely. This is how the FEM examples, like ex12, work. > >>Would that be with PetscSectionSetConstraintDof ? For that I will need > >>the PetscSection, DofSection, within DMNetwork, how can I obtain it? I > >>could cast it to DM_Network from the dm, networkdm, declared in the main > >>program, maybe something like this: > >>DM_Network *network = (DM_Network*) networkdm->data;Then I would loop > >>over the vertices and call PetscSectionSetConstraintDof if it's a > >>boundary node (by checking the corresponding component) > >> > >> > >> > >> > >>I admit to not completely understanding DMNetwork. However, it eventually > >>builds a PetscSection for data layout, which > >>you could get from DMGetDefaultSection(). The right thing to do is find > >>where it builds the Section, and put in your BC > >>there, but that sounds like it would entail coding. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> > >>Thanks for your responses.Miguel > >> > >> > >> > >> > >>On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > >> > >>Matthew Knepley writes: > >> > >>> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. > >>> >>>> wrote: > >>> > >>>> You are right. The Jacobian for the power grid application is indeed > >>>> non-symmetric. Is that a problem for your application? > >>>> > >>> > >>> If you need a symmetric Jacobian, you can use the BC facility in > >>> PetscSection, which eliminates the > >>> variables completely. This is how the FEM examples, like ex12, work. > >> > >>You can also use MatZeroRowsColumns() or do the equivalent > >>transformation during assembly (my preference). 
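Putting the DMNetwork pieces of this thread together, a sketch of the edge loop being discussed is below, written against the local residual vector so that ghost vertices can safely receive contributions (the FEM-style case Matt mentions). The two-dofs-per-vertex layout and the +/- 1.0 values are placeholders; the routine names follow the DMNetwork functions cited in the thread and the pflow example, but exact spellings and signatures should be checked against the installed PETSc version.

  #include <petscdmnetwork.h>

  /* Sketch: loop over local edges, look up the two end vertices, and add
     each edge's contribution at the right offsets of the local residual. */
  static PetscErrorCode AddEdgeContributions(DM networkdm,Vec localF)
  {
    PetscErrorCode ierr;
    PetscInt       e,eStart,eEnd,vfrom,vto,offsetfrom,offsetto;
    const PetscInt *cone;
    PetscScalar    *farr;
    PetscBool      ghost;

    ierr = DMNetworkGetEdgeRange(networkdm,&eStart,&eEnd);CHKERRQ(ierr);
    ierr = VecGetArray(localF,&farr);CHKERRQ(ierr);
    for (e=eStart; e<eEnd; e++) {
      ierr = DMNetworkGetConnectedNodes(networkdm,e,&cone);CHKERRQ(ierr);
      vfrom = cone[0]; vto = cone[1];
      ierr = DMNetworkGetVariableOffset(networkdm,vfrom,&offsetfrom);CHKERRQ(ierr);
      ierr = DMNetworkGetVariableOffset(networkdm,vto,&offsetto);CHKERRQ(ierr);
      ierr = DMNetworkIsGhostVertex(networkdm,vto,&ghost);CHKERRQ(ierr);
      if (ghost) {
        /* vto is owned by another rank; the ADD_VALUES gather in
           DMLocalToGlobal() will sum this contribution onto the owner. */
      }
      farr[offsetfrom]   += 1.0;   /* placeholder for the real edge physics */
      farr[offsetfrom+1] += 1.0;   /* two dofs per vertex, as in the thread */
      farr[offsetto]     -= 1.0;
      farr[offsetto+1]   -= 1.0;
    }
    ierr = VecRestoreArray(localF,&farr);CHKERRQ(ierr);
    return 0;
  }

If instead the scheme is FD-like and ghost vertices must not be touched, the DMNetworkIsGhostVertex() test above is the place to skip them, as Shri suggests.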
> >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >> > >> > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > > > > > >>(217) 550-2360 > >> > >> > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>What most experimenters take for granted before they begin their > >>experiments is infinitely more interesting than any results to which > >>their experiments lead. > >>-- Norbert Wiener > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>What most experimenters take for granted before they begin their > >>experiments is infinitely more interesting than any results to which > >>their experiments lead. > >>-- Norbert Wiener > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>What most experimenters take for granted before they begin their > >>experiments is infinitely more interesting than any results to which > >>their experiments lead. > >>-- Norbert Wiener > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>Miguel Angel Salazar de Troya > >>Graduate Research Assistant > >>Department of Mechanical Science and Engineering > >>University of Illinois at Urbana-Champaign > >>(217) 550-2360 > >>salaza11 at illinois.edu > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>-- > >>What most experimenters take for granted before they begin their > >>experiments is infinitely more interesting than any results to which > >>their experiments lead. > >>-- Norbert Wiener > > > > > > > > > > > > > > > > > > > >-- > > > > > >Miguel Angel Salazar de Troya > > > > > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > > > > > >salaza11 at illinois.edu > > > > > > > > > > > > > > > > > >-- > >Miguel Angel Salazar de Troya > >Graduate Research Assistant > >Department of Mechanical Science and Engineering > >University of Illinois at Urbana-Champaign > >(217) 550-2360 > >salaza11 at illinois.edu > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From salazardetroya at gmail.com Tue Sep 30 11:35:31 2014 From: salazardetroya at gmail.com (Miguel Angel Salazar de Troya) Date: Tue, 30 Sep 2014 11:35:31 -0500 Subject: [petsc-users] DMPlex with spring elements In-Reply-To: References: Message-ID: No worries. Thanks for your responses. I'm assuming you suggested to use DMNetworkIsGhostVertex() and not modify its value for the case in which I were using the global vectors and not the local vectors, where it is possible, as Matt suggested. Miguel On Tue, Sep 30, 2014 at 11:22 AM, Abhyankar, Shrirang G. 
< abhyshr at mcs.anl.gov> wrote: > > From: Miguel Angel Salazar de Troya > Date: Mon, 29 Sep 2014 16:55:14 -0500 > To: Shri > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] DMPlex with spring elements > > > >Hi all > >I'm bumping this post because I have more questions related to the same > >problem. > > > >I am looping over the edges of my DMNetwork, then I obtain the vertices > >that make up each edge with DMNetworkGetConnectedNode(). Each of these > >vertices have two variables (or actually, two degrees of freedom for my > >problem). My intentions are to modify the solution vector entries that > >are affected by these variables in each vertex. I would call the function > >DMNetworkGetVariableOffset() to do this. What happens if one of the > >vertices is a ghost vertex? Can I still modify the solution vector? My > >problem is that the edge has information to provide to these nodes. > > > > > > Sorry for the delay. I think you would want to use > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMNetworkIsG > hostVertex.html and not modify the value for the ghost vertex. > > Shri > > > > >Thanks > >Miguel > > > > > > > >On Fri, Sep 26, 2014 at 12:33 PM, Miguel Angel Salazar de Troya > > wrote: > > > >I understand. Thanks a lot. > >Miguel > > > > > >On Fri, Sep 26, 2014 at 10:53 AM, Abhyankar, Shrirang G. > > wrote: > > > >What Matt is saying is that there are two interfaces in PETSc for setting > >the residual evaluation routine: > > > >i) SNESSetFunction takes in a function pointer for the residual evaluation > >routine that has the prototype > > PetscErrorCode xyzroutine(SNES snes, Vec X, Vec F, void* > >ctx); > > > >X and F are the "global" solution and residual vectors. To compute the > >global residual evaluation, typically one does -- (a) scattering X and F > >onto local vectors localX and localF (DMGlobalToLocal), (b) computing the > >local residual, and (c) gathering the localF in the global F > >(DMLocalToGlobal). This is what is done in the example. > > > >ii) DMSNESSetFunctionLocal takes in a function pointer for the residual > >evaluation routine that has the prototype > > PetscErrorCode xyzlocalroutine(DM, Vec localX, localF, > >void* ctx) > > > >In this case, the localX and localF get passed to the routine. So, you > >only have to do the local residual evaluation. PETSc does the > >LocalToGlobal gather to form the global residual. > > > >I chose to use SNESSetFunction in the example. You can use either of them. > > > >Shri > > > >From: Matthew Knepley > >Date: Fri, 26 Sep 2014 10:28:26 -0500 > >To: Miguel Angel Salazar de Troya > >Cc: Jed Brown , Shri , > >"petsc-users at mcs.anl.gov" > >Subject: Re: [petsc-users] DMPlex with spring elements > > > > > >>On Fri, Sep 26, 2014 at 10:26 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>Yeah, but doesn't it only work with the local vectors localX and localF? > >> > >> > >> > >>I am telling you what the interface for the functions is. You can do > >>whatever you want inside. > >> > >> Matt > >> > >> > >>Miguel > >> > >>On Fri, Sep 26, 2014 at 10:10 AM, Matthew Knepley > >>wrote: > >> > >>On Fri, Sep 26, 2014 at 10:06 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>That means that if we call SNESSetFunction() we don't build the residual > >>vector in parallel? In the pflow example > >>( > http://www.mcs.anl.gov/petsc/petsc-as/petsc-current/src/snes/examples/tu > >>t > >>orials/network/pflow/pf.c.html) the function FormFunction() (Input for > >>SNESSetFunction() works with the local vectors. 
I don't understand this. > >> > >> > >> > >>FormFunction() in that link clearly takes in a global vector X and > >>returns a global vector F. Inside, it > >>converts them to local vectors. This is exactly what you would do for a > >>function given to SNESSetFunction(). > >> > >> Matt > >> > >> > >> > >>Thanks > >>Miguel > >> > >> > >>On Fri, Sep 26, 2014 at 9:34 AM, Matthew Knepley > >>wrote: > >> > >>On Fri, Sep 26, 2014 at 9:31 AM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>Thanks. I had another question about the DM and SNES and TS. There are > >>similar routines to assign the residual and jacobian evaluation to both > >>objects. For the SNES case are: > >>DMSNESSetFunctionLocal > >>DMSNESSetJacobianLocal > >> > >>What are the differences of these with: > >> > >>SNESSetFunction > >>SNESSetJacobian > >> > >> > >> > >> > >>SNESSetFunction() expects the user to construct the entire parallel > >>residual vector. DMSNESSetFunctionLocal() > >>expects the user to construct the local pieces of the residual, and then > >>it automatically calls DMLocalToGlobal() > >>to assembly the full residual. It also converts the input from global > >>vectors to local vectors, and in the case of > >>DMDA multidimensional arrays. > >> > >> Thanks, > >> > >> Matt > >> > >> > >>and when should we use each? With "Local", it is meant to evaluate the > >>function/jacobian for the elements in the local processor? I could get > >>the local edges in DMNetwork by calling DMNetworkGetEdgeRange? > >> > >>Miguel > >> > >> > >>On Thu, Sep 25, 2014 at 5:17 PM, Matthew Knepley > >>wrote: > >> > >> > >> > >>On Thu, Sep 25, 2014 at 5:15 PM, Miguel Angel Salazar de Troya > >> wrote: > >> > >>> If you need a symmetric Jacobian, you can use the BC facility in > >>> PetscSection, which eliminates the > >>> variables completely. This is how the FEM examples, like ex12, work. > >>Would that be with PetscSectionSetConstraintDof ? For that I will need > >>the PetscSection, DofSection, within DMNetwork, how can I obtain it? I > >>could cast it to DM_Network from the dm, networkdm, declared in the main > >>program, maybe something like this: > >>DM_Network *network = (DM_Network*) networkdm->data;Then I would loop > >>over the vertices and call PetscSectionSetConstraintDof if it's a > >>boundary node (by checking the corresponding component) > >> > >> > >> > >> > >>I admit to not completely understanding DMNetwork. However, it eventually > >>builds a PetscSection for data layout, which > >>you could get from DMGetDefaultSection(). The right thing to do is find > >>where it builds the Section, and put in your BC > >>there, but that sounds like it would entail coding. > >> > >> Thanks, > >> > >> Matt > >> > >> > >> > >> > >>Thanks for your responses.Miguel > >> > >> > >> > >> > >>On Thu, Sep 25, 2014 at 2:42 PM, Jed Brown wrote: > >> > >>Matthew Knepley writes: > >> > >>> On Thu, Sep 25, 2014 at 1:46 PM, Abhyankar, Shrirang G. > >>> >>>> wrote: > >>> > >>>> You are right. The Jacobian for the power grid application is indeed > >>>> non-symmetric. Is that a problem for your application? > >>>> > >>> > >>> If you need a symmetric Jacobian, you can use the BC facility in > >>> PetscSection, which eliminates the > >>> variables completely. This is how the FEM examples, like ex12, work. > >> > >>You can also use MatZeroRowsColumns() or do the equivalent > >>transformation during assembly (my preference). 
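
To make the two callback shapes quoted above concrete, the following is a minimal sketch based only on the prototypes described in this thread; FormFunction and FormFunctionLocal are illustrative names, and the bodies only mark where problem-specific assembly would go.

/* Sketch only: callback shapes for the two residual interfaces discussed above. */
#include <petscsnes.h>

/* (i) SNESSetFunction(): X and F are *global* vectors; the user scatters to a
   local vector, assembles, and gathers the result back into the global F. */
PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  DM             dm;
  Vec            localX;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetDM(snes, &dm);CHKERRQ(ierr);
  ierr = DMGetLocalVector(dm, &localX);CHKERRQ(ierr);
  ierr = DMGlobalToLocalBegin(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
  ierr = DMGlobalToLocalEnd(dm, X, INSERT_VALUES, localX);CHKERRQ(ierr);
  ierr = VecZeroEntries(F);CHKERRQ(ierr);
  /* ... compute local contributions from localX, then gather them into the
     global F, e.g. via a local work vector and DMLocalToGlobalBegin/End ... */
  ierr = DMRestoreLocalVector(dm, &localX);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* (ii) DMSNESSetFunctionLocal(): localX and localF are already local vectors
   (ghost points included); PETSc performs the local-to-global gather itself. */
PetscErrorCode FormFunctionLocal(DM dm, Vec localX, Vec localF, void *ctx)
{
  PetscFunctionBeginUser;
  /* ... fill localF from localX for the locally owned and ghost points ... */
  PetscFunctionReturn(0);
}

They would be registered roughly as SNESSetFunction(snes, r, FormFunction, &user) and DMSNESSetFunctionLocal(dm, FormFunctionLocal, &user), respectively.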
--
Miguel Angel Salazar de Troya
Graduate Research Assistant
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
(217) 550-2360
salaza11 at illinois.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From abhyshr at mcs.anl.gov  Tue Sep 30 12:20:57 2014
From: abhyshr at mcs.anl.gov (Abhyankar, Shrirang G.)
Date: Tue, 30 Sep 2014 17:20:57 +0000
Subject: [petsc-users] DMPlex with spring elements
In-Reply-To:
Message-ID:

My reply was based on the assumption that the values set at a vertex do not have contributions from other vertices or edges. In which case, you would not want to set any value for the ghost vertex.
For example, consider the following network:

    v1      e1      v2      e2      v3
     | -------------- | -------------- |

that gets distributed on two processors as follows:

    v1      e1      v2'
     | -------------- |                    [0]

    v2      e2      v3
     | -------------- |                    [1]

Here, v2' is the ghost vertex on P0. Now, if you want to set f(v2) = 2, then you do not need to set anything for f(v2'), and you call DMLocalToGlobal() at the end with the INSERT_VALUES flag. (You can also use the ADD_VALUES flag, assuming that you have zeroed the local vector initially.) On the other hand, if you have f(v2) = f(v1) + f(v3), then you would need to set f(v2') = f(v1) and f(v2) = f(v3) and call DMLocalToGlobal() with the ADD_VALUES flag.

Shri
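
A minimal sketch of the edge loop and ADD_VALUES gather described above, assuming the PETSc 3.5-era DMNetwork calls mentioned in this thread (DMNetworkGetEdgeRange(), DMNetworkGetConnectedNodes(), DMNetworkGetVariableOffset()): the exact name and signature of the connectivity call vary between PETSc versions, the two-dof-per-vertex layout follows Miguel's description, and the per-edge contribution is a placeholder rather than real physics.

/* Sketch only (assumed DMNetwork interface): each edge adds a contribution to
   both of its end vertices, ghost vertices included, and the ADD_VALUES gather
   sums the pieces across processes. */
#include <petscdmnetwork.h>

PetscErrorCode AssembleNetworkResidual(DM networkdm, Vec localX, Vec F)
{
  Vec             localF;
  PetscScalar    *farr;
  const PetscInt *cone;
  PetscInt        e, eStart, eEnd, vfrom, vto, offsetfrom, offsetto;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  /* localX would carry the current state; it is unused by this placeholder. */
  ierr = DMGetLocalVector(networkdm, &localF);CHKERRQ(ierr);
  ierr = VecZeroEntries(localF);CHKERRQ(ierr);   /* required before ADD_VALUES */
  ierr = VecGetArray(localF, &farr);CHKERRQ(ierr);

  ierr = DMNetworkGetEdgeRange(networkdm, &eStart, &eEnd);CHKERRQ(ierr);
  for (e = eStart; e < eEnd; e++) {
    /* Connectivity query discussed in the thread; name/signature assumed. */
    ierr = DMNetworkGetConnectedNodes(networkdm, e, &cone);CHKERRQ(ierr);
    vfrom = cone[0];
    vto   = cone[1];
    ierr = DMNetworkGetVariableOffset(networkdm, vfrom, &offsetfrom);CHKERRQ(ierr);
    ierr = DMNetworkGetVariableOffset(networkdm, vto,   &offsetto);CHKERRQ(ierr);
    /* Placeholder contribution to the two dofs of each end vertex; ghost
       vertices are written here and summed by the gather below. */
    farr[offsetfrom + 0] += 1.0;
    farr[offsetfrom + 1] += 1.0;
    farr[offsetto   + 0] -= 1.0;
    farr[offsetto   + 1] -= 1.0;
  }

  ierr = VecRestoreArray(localF, &farr);CHKERRQ(ierr);
  ierr = VecZeroEntries(F);CHKERRQ(ierr);
  ierr = DMLocalToGlobalBegin(networkdm, localF, ADD_VALUES, F);CHKERRQ(ierr);
  ierr = DMLocalToGlobalEnd(networkdm, localF, ADD_VALUES, F);CHKERRQ(ierr);
  ierr = DMRestoreLocalVector(networkdm, &localF);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}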
From hillsmattc at outlook.com  Tue Sep 30 12:59:53 2014
From: hillsmattc at outlook.com (Matthew Hills)
Date: Tue, 30 Sep 2014 17:59:53 +0000
Subject: [petsc-users] PETSc and MPE
Message-ID:

Thanks a mil, it worked perfectly.

Matt

Sent from Windows Mail

From: Satish Balay
Sent: Tuesday, 30 September 2014 15:06
To: Matthew Hills
Cc: petsc-users at mcs.anl.gov, Sebastian Skatulla

This hasn't been tested in a while - but I think you have to configure with:

  --download-mpe

And then run the code with:

  -log_mpe

Satish

On Tue, 30 Sep 2014, Matthew Hills wrote:

> Hi PETSc team,
>
> I'm attempting to analyze and optimize a structural analysis program called SESKA using MPE. I have configured PETSc with:
>
> ./configure --with-mpi=1 --with-cc=gcc --with-cxx=g++ --with-fc=gfortran
>   --download-f-blas-lapack=${PETSC_DIR}/externalpackages/fblaslapacklinpack-3.1.1.tar.gz
>   --with-parmetis=1 --download-parmetis=${PETSC_DIR}/externalpackages/parmetis-4.0.2-p3.tar.gz
>   --with-mumps=1 --download-mumps=${PETSC_DIR}/externalpackages/MUMPS_4.10.0-p3.tar.gz
>   --with-scalapack=1 --download-scalapack=${PETSC_DIR}/externalpackages/scalapack-2.0.2.tgz
>   --download-blacs=${PETSC_DIR}/externalpackages/blacs-dev.tar.gz
>   --with-superlu_dist=1 --download-superlu_dist=${PETSC_DIR}/externalpackages/superlu_dist_3.1.tar.gz
>   --with-metis --download-metis=${PETSC_DIR}/externalpackages/metis-5.0.2-p3.tar.gz
>   --with-sowing=${PETSC_DIR}/externalpackages/sowing-1.1.16d.tar.gz
>   --with-c2html=0 --with-shared-libraries=1 --download-mpich-mpe=1
>   --download-mpich=${SESKADIR}/packages/downloads/mpich-3.0.4.tar.gz
>
> and then run SESKA with:
>
> mpiexec -n 8 seska -log_mpe mpe.log
>
> However mpe.log is not created. I wish to view this file in Jumpshot. Any assistance would be greatly appreciated.
>
> Regards,
> Matthew

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gbisht at lbl.gov  Tue Sep 30 17:29:10 2014
From: gbisht at lbl.gov (Gautam Bisht)
Date: Tue, 30 Sep 2014 15:29:10 -0700
Subject: [petsc-users] Allocating memory for off-diagonal matrix blocks when using DMComposite
Message-ID:

Hi,

The comment on line 419 of SNES ex28.c says that the approach used in this example is not the best way to allocate off-diagonal blocks. Is there an example that shows a better way of allocating memory for off-diagonal matrix blocks when using DMComposite?

Thanks,
-Gautam.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jed at jedbrown.org Tue Sep 30 17:48:19 2014 From: jed at jedbrown.org (Jed Brown) Date: Tue, 30 Sep 2014 17:48:19 -0500 Subject: [petsc-users] Allocating memory for off-diagonal matrix blocks when using DMComposite In-Reply-To: References: Message-ID: <87mw9g3eto.fsf@jedbrown.org> Gautam Bisht writes: > Hi, > > The comment on line 419 of SNES ex 28.c > > says > that the approach used in this example is not the best way to allocate > off-diagonal blocks. Is there an example that shows a better way off > allocating memory for off-diagonal matrix blocks when using DMComposite? The problem is that the allocation interfaces specialize on the matrix type. Barry wrote a DMCompositeSetCoupling(), but there are no examples. This is something that PETSc needs to improve. I have been unsuccessful at conceiving a _simple_ yet flexible/extensible solution. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From Nan.Zhang at colorado.edu Tue Sep 30 18:20:24 2014 From: Nan.Zhang at colorado.edu (Nan Zhang) Date: Tue, 30 Sep 2014 19:20:24 -0400 Subject: [petsc-users] What is "You must specify a path for MPI with --with-mpi-dir=" Message-ID: Hi there, I am trying to install petsc with its own mpicc. I download it and untar it under my directory. I export PETSC_DIR=`pwd` and export PETSC_ARCH=x86_64. When I do configure: ./config/configure.py ?with-shared-libraries=1 ?with-x=0 ?with-mpi=1 -download-hypre=1 ?download-f-blas-lapack=1 ?download-mpich=1 -download-mpicc=1 ?download-f2cblaslapack=1 I got an error report: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The version of PETSc you are using is out-of-date, we recommend updating to the new release Available Version: 3.5.2 Installed Version: 3.4.2 http://www.mcs.anl.gov/petsc/download/index.html +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ =============================================================================== Configuring PETSc to compile on your system =============================================================================== TESTING: check from config.libraries(config/BuildSystem/config/libraries.py:145) ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- You must specify a path for MPI with --with-mpi-dir= If you do not want MPI, then give --with-mpi=0 You might also consider using --download-mpi instead ******************************************************************************* How to solve this? I also attach my configure.log below. Thanks! Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2195763 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Sep 30 18:28:25 2014 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Sep 2014 18:28:25 -0500 Subject: [petsc-users] What is "You must specify a path for MPI with --with-mpi-dir=" In-Reply-To: References: Message-ID: Do not use any of these options -download-mpicc=1 ?download-f2cblaslapack=1 ?download-f-blas-lapack=1 Send the resulting configure.log again if it fails. 
Also we recommend installing the latest version of PETSc instead of this old one. Barry On Sep 30, 2014, at 6:20 PM, Nan Zhang wrote: > Hi there, > > I am trying to install petsc with its own mpicc. I download it and untar it under my directory. I export PETSC_DIR=`pwd` and export PETSC_ARCH=x86_64. > > When I do configure: > ./config/configure.py ?with-shared-libraries=1 ?with-x=0 ?with-mpi=1 -download-hypre=1 ?download-f-blas-lapack=1 ?download-mpich=1 -download-mpicc=1 ?download-f2cblaslapack=1 > > I got an error report: > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > The version of PETSc you are using is out-of-date, we recommend updating to the new release > Available Version: 3.5.2 Installed Version: 3.4.2 > http://www.mcs.anl.gov/petsc/download/index.html > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > =============================================================================== > Configuring PETSc to compile on your system > =============================================================================== > TESTING: check from config.libraries(config/BuildSystem/config/libraries.py:145) ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > ------------------------------------------------------------------------------- > You must specify a path for MPI with --with-mpi-dir= > If you do not want MPI, then give --with-mpi=0 > You might also consider using --download-mpi instead > ******************************************************************************* > How to solve this? I also attach my configure.log below. > > Thanks! > Nan > From thronesf at gmail.com Tue Sep 30 21:13:37 2014 From: thronesf at gmail.com (Sharp Stone) Date: Tue, 30 Sep 2014 22:13:37 -0400 Subject: [petsc-users] How to use Message-ID: Hi all, I have four differential equations to be solved with a linear sparse matrix system that has the form of "(mat)A dot (vec)X = (vec)B". For each node, I have dof=4. I found few tutorials or examples on the KSPSolve with dof>1, and I do know it's possible to solve this problem with Petsc. Are there any sources illustrating ksp solver with dof>1? Many thanks! Sorry for the stupid questions. -- Best regards, -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 30 21:15:21 2014 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Sep 2014 21:15:21 -0500 Subject: [petsc-users] How to use In-Reply-To: References: Message-ID: On Tue, Sep 30, 2014 at 9:13 PM, Sharp Stone wrote: > Hi all, > > I have four differential equations to be solved with a linear sparse > matrix system that has the form of "(mat)A dot (vec)X = (vec)B". For each > node, I have dof=4. I found few tutorials or examples on the KSPSolve with > dof>1, and I do know it's possible to solve this problem with Petsc. Are > there any sources illustrating ksp solver with dof>1? Many thanks! > Yes, try SNES ex19. Thanks, Matt > Sorry for the stupid questions. > > -- > Best regards, > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL:
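
For the dof > 1 pattern that SNES ex19 illustrates, a rough outline of the linear case looks like the following: create a DMDA with 4 components per node, let it build the matrix and vectors, fill the matrix with MatSetValuesStencil() (the .c field of MatStencil selects the component), and hand everything to KSP. The grid size, stencil choice, and coefficients below are placeholders, not taken from this thread, and this follows the PETSc 3.5-era call sequence; newer PETSc versions may also require DMSetFromOptions()/DMSetUp() after DMDACreate2d().

/* Sketch only: solve A x = b with 4 dofs per grid node using a DMDA and KSP.
   The assembled operator is a placeholder (one trivial equation per component);
   a real application would insert its discretized coupled equations. */
#include <petscdmda.h>
#include <petscksp.h>

int main(int argc, char **argv)
{
  DM             da;
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscInt       i, j, c, xs, ys, xm, ym;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* 2D grid, 64x64 nodes, 4 degrees of freedom per node, stencil width 1 */
  ierr = DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                      DMDA_STENCIL_STAR, 64, 64, PETSC_DECIDE, PETSC_DECIDE,
                      4, 1, NULL, NULL, &da);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da, &A);CHKERRQ(ierr);       /* preallocated for dof = 4 */
  ierr = DMCreateGlobalVector(da, &b);CHKERRQ(ierr);
  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = DMDAGetCorners(da, &xs, &ys, NULL, &xm, &ym, NULL);CHKERRQ(ierr);
  for (j = ys; j < ys + ym; j++) {
    for (i = xs; i < xs + xm; i++) {
      for (c = 0; c < 4; c++) {                      /* one equation per component */
        MatStencil  row, col;
        PetscScalar v = 2.0;                         /* placeholder coefficient */
        row.k = 0; row.j = j; row.i = i; row.c = c;
        col   = row;
        ierr = MatSetValuesStencil(A, 1, &row, 1, &col, &v, INSERT_VALUES);CHKERRQ(ierr);
      }
    }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);   /* PETSc 3.5 signature */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Solver choices are then made at run time, e.g. -ksp_type gmres -pc_type bjacobi, as in the PETSc examples.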