[petsc-users] [beginner question] Different communicators in the two objects: Argument # 1 and 2 flag 3!

Barry Smith bsmith at mcs.anl.gov
Tue Apr 22 11:35:48 CDT 2014


   Call BOTH MatAssemblyBegin() and MatAssemblyEnd() AFTER you have called the set-values routines.
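
   For concreteness, a minimal sketch of that ordering, reusing the A, b, rank and size from the code quoted below (set all values first, then assemble both objects):

      for(int i = 0; i < 4 * size; ++i){
        CHKERRXX(VecSetValue(b, i, 1, ADD_VALUES));
        CHKERRXX(MatSetValue(A, i, i, rank+1, ADD_VALUES));
      }

      /* Begin AND End only after every value has been set */
      CHKERRXX(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
      CHKERRXX(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
      CHKERRXX(VecAssemblyBegin(b));
      CHKERRXX(VecAssemblyEnd(b));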

   Barry

  Yes, you could argue it is confusing terminology.


On Apr 22, 2014, at 11:32 AM, Niklas Fischer <niklas at niklasfi.de> wrote:

> Hello Barry,
> 
> Am 22.04.2014 18:08, schrieb Barry Smith:
>> On Apr 22, 2014, at 10:23 AM, Niklas Fischer <niklas at niklasfi.de> wrote:
>> 
>>> I have tracked down the problem further and it basically boils down to the following question: Is it possible to use MatSetValue(s) to set values which are owned by other processes?
>>    Yes it is certainly possible. You should not set a large percent of the values on the “wrong” process but setting some is fine. The values will also be added together if you use ADD_VALUES.
>> 
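>>    As a minimal sketch of that (the particular row index here is only for illustration): any process may add into a row owned by another process, and with ADD_VALUES the contributions are summed once assembly completes.
>> 
>>      if (rank == 0) {
>>        /* row 4*size-1 is owned by the last process; ADD_VALUES sums this
>>           contribution with whatever that process adds locally */
>>        CHKERRXX(MatSetValue(A, 4*size - 1, 4*size - 1, 1.0, ADD_VALUES));
>>      }
>>      CHKERRXX(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
>>      CHKERRXX(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));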
>>    Below have you called the MatAssemblyBegin/End after setting all the values?
> It certainly is AssemblyBegin first, then set values, then AssemblyEnd:
> 
>    CHKERRXX(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
>    CHKERRXX(VecAssemblyBegin(b));
> 
> 
>    for(int i = 0; i < 4 * size; ++i){
>      CHKERRXX(VecSetValue(b, i, 1, ADD_VALUES));
>    }
> 
>    for(int i = 0; i < 4 * size; ++i){
>      CHKERRXX(MatSetValue(A, i, i, rank+1, ADD_VALUES));
>    }
> 
>    CHKERRXX(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
>    CHKERRXX(VecAssemblyEnd(b));
> 
> My observation that setting the values does not work also ties in with the solution returned by the solver, which is the result of solving
> 
> Diag[1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4] x = Constant(1,16)
> 
> i.e. each process's block of the solution equals 1/(rank+1), as if only the local contribution rank+1 had reached each diagonal entry:
> 
> Process [0]
> 1
> 1
> 1
> 1
> Process [1]
> 0.5
> 0.5
> 0.5
> 0.5
> Process [2]
> 0.333333
> 0.333333
> 0.333333
> 0.333333
> Process [3]
> 0.25
> 0.25
> 0.25
> 0.25
>>    
>>> If I create a matrix with
>>> 
>>>     for(int i = 0; i < 4 * size; ++i){
>>>       CHKERRXX(MatSetValue(A, i, i, rank+1, ADD_VALUES));
>>>     }
>>> 
>>> for n = m = 4 on four processes, one would expect each diagonal entry to be 1 + 2 + 3 + 4 = 10; however, PETSc prints
>>> Matrix Object: 1 MPI processes
>>>   type: mpiaij
>>> row 0: (0, 1)
>>> row 1: (1, 1)
>>> row 2: (2, 1)
>>> row 3: (3, 1)
>>> row 4: (4, 2)
>>> row 5: (5, 2)
>>> row 6: (6, 2)
>>> row 7: (7, 2)
>>> row 8: (8, 3)
>>> row 9: (9, 3)
>>> row 10: (10, 3)
>>> row 11: (11, 3)
>>> row 12: (12, 4)
>>> row 13: (13, 4)
>>> row 14: (14, 4)
>>> row 15: (15, 4)
>>> which is exactly what
>>> 
>>> CHKERRXX(VecGetOwnershipRange(x, &ownership_start, &ownership_end));
>>> for(int i = ownership_start; i < ownership_end; ++i){
>>>    CHKERRXX(MatSetValue(A, i, i, rank+1, ADD_VALUES));
>>> }
>>> would give us.
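>>> For reference, the matrix-side equivalent of that loop, as a sketch using MatGetOwnershipRange (which returns the half-open range of rows owned by the calling process):
>>> 
>>>     PetscInt rstart, rend;
>>>     CHKERRXX(MatGetOwnershipRange(A, &rstart, &rend));
>>>     for(PetscInt i = rstart; i < rend; ++i){
>>>        /* each process sets only the diagonal entries of its own rows */
>>>        CHKERRXX(MatSetValue(A, i, i, rank+1, ADD_VALUES));
>>>     }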
>>> 
>>> Kind regards,
>>> Niklas Fischer
>>> 
>>> Am 22.04.2014 14:59, schrieb Niklas Fischer:
>>>> I should probably note that everything is fine if I run the serial version of this (with the exact same matrix and right-hand side).
>>>> 
>>>> PETSc KSPSolve done, residual norm: 3.13459e-13, it took 6 iterations.
>>>> 
>>>> Am 22.04.2014 14:12, schrieb Niklas Fischer:
>>>>> Am 22.04.2014 13:57, schrieb Matthew Knepley:
>>>>>> On Tue, Apr 22, 2014 at 6:48 AM, Niklas Fischer <niklas at niklasfi.de> wrote:
>>>>>> Am 22.04.2014 13:08, schrieb Jed Brown:
>>>>>> Niklas Fischer <niklas at niklasfi.de> writes:
>>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> I have attached a small test case for a problem I am experiencing. What
>>>>>> this dummy program does is read a vector and a matrix from a text
>>>>>> file and then solve Ax=b. The same data is available in two forms:
>>>>>>   - everything is in one file (matops.s.0 and vops.s.0)
>>>>>>   - the matrix and vector are split between processes (matops.0,
>>>>>> matops.1, vops.0, vops.1)
>>>>>> 
>>>>>> The serial version of the program works perfectly fine, but unfortunately
>>>>>> errors occur when running the parallel version:
>>>>>> 
>>>>>> make && mpirun -n 2 a.out matops vops
>>>>>> 
>>>>>> mpic++ -DPETSC_CLANGUAGE_CXX -isystem
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/include -isystem
>>>>>> /home/data/fischer/libs/petsc-3.4.3/include petsctest.cpp -Werror -Wall
>>>>>> -Wpedantic -std=c++11 -L
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib -lpetsc
>>>>>> /usr/bin/ld: warning: libmpi_cxx.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi_cxx.so.1
>>>>>> /usr/bin/ld: warning: libmpi.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi.so.1
>>>>>> librdmacm: couldn't read ABI version.
>>>>>> librdmacm: assuming: 4
>>>>>> CMA: unable to get RDMA device list
>>>>>> --------------------------------------------------------------------------
>>>>>> [[43019,1],0]: A high-performance Open MPI point-to-point messaging module
>>>>>> was unable to find any relevant network interfaces:
>>>>>> 
>>>>>> Module: OpenFabrics (openib)
>>>>>>    Host: dornroeschen.igpm.rwth-aachen.de
>>>>>> CMA: unable to get RDMA device list
>>>>>> It looks like your MPI is either broken or some of the code linked into
>>>>>> your application was compiled with a different MPI or different version.
>>>>>> Make sure you can compile and run simple MPI programs in parallel.
>>>>>> Hello Jed,
>>>>>> 
>>>>>> thank you for your input. Unfortunately, MPI does not seem to be the issue here. The attachment contains a simple MPI hello-world program which runs flawlessly (I will append its output to this mail), and I have not encountered any problems with other MPI programs. My question still stands.
>>>>>> 
>>>>>> This is a simple error. You created the matrix A using PETSC_COMM_WORLD, but you try to view it
>>>>>> using PETSC_VIEWER_STDOUT_SELF. You need to use PETSC_VIEWER_STDOUT_WORLD in
>>>>>> order to match.
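>>>>>> A minimal sketch of the matching call (assuming A lives on PETSC_COMM_WORLD, as in your code; the call is collective, so every rank must make it):
>>>>>> 
>>>>>>   /* collective viewer on PETSC_COMM_WORLD, matching the matrix's communicator */
>>>>>>   CHKERRXX(MatView(A, PETSC_VIEWER_STDOUT_WORLD));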
>>>>>> 
>>>>>>   Thanks,
>>>>>> 
>>>>>>      Matt
>>>>>>  Greetings,
>>>>>> Niklas Fischer
>>>>>> 
>>>>>> mpirun -np 2 ./mpitest
>>>>>> 
>>>>>> librdmacm: couldn't read ABI version.
>>>>>> librdmacm: assuming: 4
>>>>>> CMA: unable to get RDMA device list
>>>>>> --------------------------------------------------------------------------
>>>>>> [[44086,1],0]: A high-performance Open MPI point-to-point messaging module
>>>>>> was unable to find any relevant network interfaces:
>>>>>> 
>>>>>> Module: OpenFabrics (openib)
>>>>>>   Host: dornroeschen.igpm.rwth-aachen.de
>>>>>> 
>>>>>> Another transport will be used instead, although this may result in
>>>>>> lower performance.
>>>>>> --------------------------------------------------------------------------
>>>>>> librdmacm: couldn't read ABI version.
>>>>>> librdmacm: assuming: 4
>>>>>> CMA: unable to get RDMA device list
>>>>>> Hello world from processor dornroeschen.igpm.rwth-aachen.de, rank 0 out of 2 processors
>>>>>> Hello world from processor dornroeschen.igpm.rwth-aachen.de, rank 1 out of 2 processors
>>>>>> [dornroeschen.igpm.rwth-aachen.de:128141] 1 more process has sent help message help-mpi-btl-base.txt / btl:no-nics
>>>>>> [dornroeschen.igpm.rwth-aachen.de:128141] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>>> 
>>>>> Thank you, Matthew, this solves my viewing problem. Am I doing something wrong when initializing the matrices as well? The matrix's viewer output starts with "Matrix Object: 1 MPI processes" and the Krylov solver does not converge.
>>>>> 
>>>>> Your help is really appreciated,
>>>>> Niklas Fischer
>>>>>> 
>>>>>> -- 
>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>> -- Norbert Wiener
> 


