[petsc-users] [beginner question] Different communicators in the two objects: Argument # 1 and 2 flag 3!

Matthew Knepley knepley at gmail.com
Tue Apr 22 10:13:44 CDT 2014


On Tue, Apr 22, 2014 at 7:59 AM, Niklas Fischer <niklas at niklasfi.de> wrote:

>  I should probably note that everything is fine if I run the serial
> version of this (with the exact same matrix + right hand side).
>
> PETSc KSPSolve done, residual norm: 3.13459e-13, it took 6 iterations.
>

Yes, your preconditioner is weaker in parallel since it is block Jacobi. If
you just want to solve the problem, use a parallel sparse direct
factorization, like SuperLU_dist or MUMPS. Reconfigure using
--download-superlu-dist or --download-mumps, and then use

  -pc_type lu -pc_factor_mat_solver_package mumps

If you want a really scalable solution, then you have to know about your
operator, not just the discretization.
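
For reference, a minimal sketch of making the same choice in code rather than
on the command line (this assumes a KSP object named ksp already exists;
PCFactorSetMatSolverPackage is the name of this call in the 3.4.x API):

  PC pc;
  KSPGetPC(ksp, &pc);                               /* get the preconditioner context */
  PCSetType(pc, PCLU);                              /* direct LU instead of block Jacobi */
  PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);  /* hand the parallel factorization to MUMPS */
  KSPSetFromOptions(ksp);                           /* command-line options can still override */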

   Matt


> On 22.04.2014 14:12, Niklas Fischer wrote:
>
>
> On 22.04.2014 13:57, Matthew Knepley wrote:
>
>  On Tue, Apr 22, 2014 at 6:48 AM, Niklas Fischer <niklas at niklasfi.de> wrote:
>
>> On 22.04.2014 13:08, Jed Brown wrote:
>>
>>> Niklas Fischer <niklas at niklasfi.de> writes:
>>>
>>>  Hello,
>>>>
>>>> I have attached a small test case for a problem I am experiencing. What
>>>> this dummy program does is it reads a vector and a matrix from a text
>>>> file and then solves Ax=b. The same data is available in two forms:
>>>>   - everything is in one file (matops.s.0 and vops.s.0)
>>>>   - the matrix and vector are split between processes (matops.0,
>>>> matops.1, vops.0, vops.1)
>>>>
>>>> The serial version of the program works perfectly fine, but unfortunately
>>>> errors occur when running the parallel version:
>>>>
>>>> make && mpirun -n 2 a.out matops vops
>>>>
>>>> mpic++ -DPETSC_CLANGUAGE_CXX -isystem
>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/include -isystem
>>>> /home/data/fischer/libs/petsc-3.4.3/include petsctest.cpp -Werror -Wall
>>>> -Wpedantic -std=c++11 -L
>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib -lpetsc
>>>> /usr/bin/ld: warning: libmpi_cxx.so.0, needed by
>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>> may conflict with libmpi_cxx.so.1
>>>> /usr/bin/ld: warning: libmpi.so.0, needed by
>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>> may conflict with libmpi.so.1
>>>> librdmacm: couldn't read ABI version.
>>>> librdmacm: assuming: 4
>>>> CMA: unable to get RDMA device list
>>>>
>>>> --------------------------------------------------------------------------
>>>> [[43019,1],0]: A high-performance Open MPI point-to-point messaging
>>>> module
>>>> was unable to find any relevant network interfaces:
>>>>
>>>> Module: OpenFabrics (openib)
>>>>    Host: dornroeschen.igpm.rwth-aachen.de
>>>> CMA: unable to get RDMA device list
>>>>
>>> It looks like your MPI is either broken or some of the code linked into
>>> your application was compiled with a different MPI or different version.
>>> Make sure you can compile and run simple MPI programs in parallel.
>>>
>> Hello Jed,
>>
>> thank you for your input. Unfortunately, MPI does not seem to be the
>> issue here. The attachment contains a simple MPI hello world program which
>> runs flawlessly (I will append the output to this mail) and I have not
>> encountered any problems with other MPI programs. My question still stands.
>>
>
>  This is a simple error. You created the matrix A using PETSC_COMM_WORLD,
> but you try to view it using PETSC_VIEWER_STDOUT_SELF. You need to use
> PETSC_VIEWER_STDOUT_WORLD in order to match.
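>
> For illustration, a minimal sketch of the matching calls (assuming the matrix
> is named A and the right-hand side b, as in your Ax=b description; adjust to
> your actual variable names):
>
>   MatView(A, PETSC_VIEWER_STDOUT_WORLD);   /* viewer on the same communicator as A */
>   VecView(b, PETSC_VIEWER_STDOUT_WORLD);   /* likewise for the vector */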
>
>    Thanks,
>
>       Matt
>
>
>> Greetings,
>> Niklas Fischer
>>
>> mpirun -np 2 ./mpitest
>>
>> librdmacm: couldn't read ABI version.
>> librdmacm: assuming: 4
>> CMA: unable to get RDMA device list
>> --------------------------------------------------------------------------
>> [[44086,1],0]: A high-performance Open MPI point-to-point messaging module
>> was unable to find any relevant network interfaces:
>>
>> Module: OpenFabrics (openib)
>>   Host: dornroeschen.igpm.rwth-aachen.de
>>
>> Another transport will be used instead, although this may result in
>> lower performance.
>> --------------------------------------------------------------------------
>> librdmacm: couldn't read ABI version.
>> librdmacm: assuming: 4
>> CMA: unable to get RDMA device list
>> Hello world from processor dornroeschen.igpm.rwth-aachen.de, rank 0 out
>> of 2 processors
>> Hello world from processor dornroeschen.igpm.rwth-aachen.de, rank 1 out
>> of 2 processors
>> [dornroeschen.igpm.rwth-aachen.de:128141] 1 more process has sent help
>> message help-mpi-btl-base.txt / btl:no-nics
>> [dornroeschen.igpm.rwth-aachen.de:128141] Set MCA parameter
>> "orte_base_help_aggregate" to 0 to see all help / error messages
>>
>
>   Thank you, Matthew, this solves my viewing problem. Am I doing
> something wrong when initializing the matrices as well? The matrix's viewer
> output starts with "Matrix Object: 1 MPI processes" and the Krylov solver
> does not converge.
>
> Your help is really appreciated,
> Niklas Fischer
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener