<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 22, 2014 at 7:59 AM, Niklas Fischer <span dir="ltr"><<a href="mailto:niklas@niklasfi.de" target="_blank">niklas@niklasfi.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
I should probably note that everything is fine if I run the serial
version of this (with the exact same matrix + right hand side).<br>
<br>
PETSc KSPSolve done, residual norm: 3.13459e-13, it took 6
iterations.<br></div></blockquote><div><br></div><div>Yes, your preconditioner is weaker in parallel since it is block Jacobi. If you just want to solve</div><div>the problem, use a parallel sparse direct factorization, like SuperLU_dist or MUMPS. You</div>
<div>reconfigure using --download-superlu-dist or --download-mumps, and then use</div><div><br></div><div> -pc_type lu -pc_factor_mat_solver_package mumps</div><div><br></div><div>If you want a really scalable solution, then you have to know about your operator, not just the discretization.</div>
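
For concreteness, with the run line from the original post (a.out with the
matops/vops data), the reconfigure-and-run steps could look roughly like the
following. This is only a sketch: it assumes any other configure options from
the original build are kept, and that the driver calls KSPSetFromOptions() so
the command-line options are actually picked up.

  ./configure --download-mumps        # or: --download-superlu-dist
  make

  mpirun -n 2 a.out matops vops -pc_type lu -pc_factor_mat_solver_package mumps
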
> On 22.04.2014 14:12, Niklas Fischer wrote:
>
>> On 22.04.2014 13:57, Matthew Knepley wrote:
>>
>>> On Tue, Apr 22, 2014 at 6:48 AM, Niklas Fischer <niklas@niklasfi.de> wrote:
>>>
>>>> On 22.04.2014 13:08, Jed Brown wrote:
>>>>
>>>>> Niklas Fischer <niklas@niklasfi.de> writes:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have attached a small test case for a problem I am experiencing.
>>>>>> What this dummy program does is read a vector and a matrix from a
>>>>>> text file and then solve Ax = b. The same data is available in two
>>>>>> forms:
>>>>>>  - everything is in one file (matops.s.0 and vops.s.0)
>>>>>>  - the matrix and vector are split between processes (matops.0,
>>>>>>    matops.1, vops.0, vops.1)
>>>>>>
>>>>>> The serial version of the program works perfectly fine, but
>>>>>> unfortunately errors occur when running the parallel version:
>>>>>>
>>>>>> make && mpirun -n 2 a.out matops vops
>>>>>>
>>>>>> mpic++ -DPETSC_CLANGUAGE_CXX -isystem
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/include -isystem
>>>>>> /home/data/fischer/libs/petsc-3.4.3/include petsctest.cpp -Werror -Wall
>>>>>> -Wpedantic -std=c++11 -L
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib -lpetsc
>>>>>> /usr/bin/ld: warning: libmpi_cxx.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi_cxx.so.1
>>>>>> /usr/bin/ld: warning: libmpi.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi.so.1
>>>>>> librdmacm: couldn't read ABI version.
>>>>>> librdmacm: assuming: 4
>>>>>> CMA: unable to get RDMA device list
>>>>>> --------------------------------------------------------------------------
>>>>>> [[43019,1],0]: A high-performance Open MPI point-to-point messaging module
>>>>>> was unable to find any relevant network interfaces:
>>>>>>
>>>>>> Module: OpenFabrics (openib)
>>>>>> Host: dornroeschen.igpm.rwth-aachen.de
>>>>>> CMA: unable to get RDMA device list
>>>>>
>>>>> It looks like your MPI is either broken or some of the code linked into
>>>>> your application was compiled with a different MPI or different version.
>>>>> Make sure you can compile and run simple MPI programs in parallel.
>>>>>
>>>> Hello Jed,
>>>>
>>>> thank you for your input. Unfortunately, MPI does not seem to be the
>>>> issue here. The attachment contains a simple MPI hello-world program
>>>> which runs flawlessly (I will append the output to this mail), and I
>>>> have not encountered any problems with other MPI programs. My question
>>>> still stands.
>>>>
>>> This is a simple error. You created the matrix A using PETSC_COMM_WORLD,
>>> but you try to view it using PETSC_VIEWER_STDOUT_SELF. You need to use
>>> PETSC_VIEWER_STDOUT_WORLD in order to match.
>>>
>>>   Thanks,
>>>
>>>      Matt
>>>
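
As a concrete illustration of the fix quoted above, here is a minimal,
self-contained sketch in which the viewer lives on the same communicator as
the matrix. The names and sizes are illustrative (this is not the attached
test case), and CHKERRQ error checking is omitted for brevity.

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat A;

    PetscInitialize(&argc, &argv, NULL, NULL);
    MatCreate(PETSC_COMM_WORLD, &A);                   /* parallel matrix */
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 4, 4);  /* illustrative size */
    MatSetFromOptions(A);
    MatSetUp(A);
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
    /* The viewer's communicator must match the Mat's communicator:
       PETSC_VIEWER_STDOUT_WORLD here, not PETSC_VIEWER_STDOUT_SELF. */
    MatView(A, PETSC_VIEWER_STDOUT_WORLD);
    MatDestroy(&A);
    PetscFinalize();
    return 0;
  }
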
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Greetings,<br>
Niklas Fischer<br>
<br>
mpirun -np 2 ./mpitest<br>
<br>
librdmacm: couldn't read ABI version.<br>
librdmacm: assuming: 4<br>
CMA: unable to get RDMA device list<br>
--------------------------------------------------------------------------<br>
[[44086,1],0]: A high-performance Open MPI
point-to-point messaging module<br>
was unable to find any relevant network interfaces:<br>
<br>
Module: OpenFabrics (openib)<br>
Host: <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a><br>
<br>
Another transport will be used instead, although this
may result in<br>
lower performance.<br>
--------------------------------------------------------------------------<br>
librdmacm: couldn't read ABI version.<br>
librdmacm: assuming: 4<br>
CMA: unable to get RDMA device list<br>
Hello world from processor <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a>,
rank 0 out of 2 processors<br>
Hello world from processor <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a>,
rank 1 out of 2 processors<br>
[<a href="http://dornroeschen.igpm.rwth-aachen.de:128141" target="_blank">dornroeschen.igpm.rwth-aachen.de:128141</a>]
1 more process has sent help message
help-mpi-btl-base.txt / btl:no-nics<br>
[<a href="http://dornroeschen.igpm.rwth-aachen.de:128141" target="_blank">dornroeschen.igpm.rwth-aachen.de:128141</a>]
Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
>> Thank you, Matthew, this solves my viewing problem. Am I doing something
>> wrong when initializing the matrices as well? The matrix's viewing output
>> starts with "Matrix Object: 1 MPI processes" and the Krylov solver does
>> not converge.
>>
>> Your help is really appreciated,
>> Niklas Fischer
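
For reference, a minimal sketch of the solve phase of a driver like the one
described in the original post, written so that options such as
"-pc_type lu -pc_factor_mat_solver_package mumps" are picked up at run time.
This is illustrative only (not the attached test case); it assumes A, b, and
x were created on PETSC_COMM_WORLD and assembled, and omits error checking.

  #include <petscksp.h>

  /* Solve Ax = b and report the residual norm and iteration count, in the
     spirit of the "PETSc KSPSolve done, ..." message quoted above. */
  static void solve_system(Mat A, Vec b, Vec x)
  {
    KSP       ksp;
    PetscReal rnorm;
    PetscInt  its;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* PETSc 3.4-style signature; later releases drop the MatStructure flag. */
    KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);
    KSPSetFromOptions(ksp);   /* honors -ksp_* and -pc_* command-line options */
    KSPSolve(ksp, b, x);
    KSPGetResidualNorm(ksp, &rnorm);
    KSPGetIterationNumber(ksp, &its);
    PetscPrintf(PETSC_COMM_WORLD,
                "PETSc KSPSolve done, residual norm: %g, it took %D iterations.\n",
                (double)rnorm, its);
    KSPDestroy(&ksp);
  }
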
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra"><br clear="all"><span class="HOEnZb"><font color="#888888">
<div><br>
</div>
-- <br>
What most experimenters take for granted before they begin
their experiments is infinitely more interesting than any
results to which their experiments lead.<br>
-- Norbert Wiener </font></span></div><span class="HOEnZb"><font color="#888888">
</font></span></div><span class="HOEnZb"><font color="#888888">
</font></span></blockquote><span class="HOEnZb"><font color="#888888">
<br>
</font></span></blockquote>
<br>
</div>

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener