<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Apr 22, 2014 at 7:59 AM, Niklas Fischer <span dir="ltr"><<a href="mailto:niklas@niklasfi.de" target="_blank">niklas@niklasfi.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    I should probably note that everything is fine if I run the serial
    version of this (with the exact same matrix + right hand side).<br>
    <br>
    PETSc KSPSolve done, residual norm: 3.13459e-13, it took 6
    iterations.<br></div></blockquote><div><br></div><div>Yes, your preconditioner is weaker in parallel since it is block Jacobi. If you just want to solve</div><div>the problem, use a parallel sparse direct factorization, like SuperLU_dist or MUMPS. You</div>
<div>reconfigure using --download-superlu-dist or --download-mumps, and then use</div><div><br></div><div>  -pc_type lu -pc_factor_mat_solver_package mumps</div><div><br></div><div>If you want a really scalable solution, then you have to know about your operator, not just the discretization.</div>
<div><br></div><div>   Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF">
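(For reference, the same choice can be made in code. This is only a minimal
sketch, assuming an existing KSP object called ksp and a PETSc 3.4 build
configured with --download-mumps; it is not part of the original program.)

  #include <petscksp.h>

  /* Sketch: select a parallel LU factorization from MUMPS for an existing KSP. */
  PetscErrorCode UseMumpsLU(KSP ksp)
  {
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
    ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }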
> On 22.04.2014 14:12, Niklas Fischer wrote:
>
>> On 22.04.2014 13:57, Matthew Knepley wrote:
>>
>>> On Tue, Apr 22, 2014 at 6:48 AM, Niklas Fischer <niklas@niklasfi.de> wrote:
>>>
>>>> On 22.04.2014 13:08, Jed Brown wrote:
>>>>
>>>>> Niklas Fischer <niklas@niklasfi.de> writes:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have attached a small test case for a problem I am experiencing.
>>>>>> What this dummy program does is read a vector and a matrix from a
>>>>>> text file and then solve Ax = b. The same data is available in two
>>>>>> forms:
>>>>>>   - everything in one file (matops.s.0 and vops.s.0)
>>>>>>   - the matrix and vector split between processes (matops.0,
>>>>>>     matops.1, vops.0, vops.1)
>>>>>>
>>>>>> The serial version of the program works perfectly fine, but
>>>>>> unfortunately errors occur when running the parallel version:
>>>>>>
>>>>>> make && mpirun -n 2 a.out matops vops
>>>>>>
>>>>>> mpic++ -DPETSC_CLANGUAGE_CXX -isystem
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/include
>>>>>> -isystem /home/data/fischer/libs/petsc-3.4.3/include
>>>>>> petsctest.cpp -Werror -Wall -Wpedantic -std=c++11
>>>>>> -L /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib -lpetsc
>>>>>> /usr/bin/ld: warning: libmpi_cxx.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi_cxx.so.1
>>>>>> /usr/bin/ld: warning: libmpi.so.0, needed by
>>>>>> /home/data/fischer/libs/petsc-3.4.3/arch-linux2-c-debug/lib/libpetsc.so,
>>>>>> may conflict with libmpi.so.1
>>>>>> librdmacm: couldn't read ABI version.
>>>>>> librdmacm: assuming: 4
>>>>>> CMA: unable to get RDMA device list
>>>>>> --------------------------------------------------------------------------
>>>>>> [[43019,1],0]: A high-performance Open MPI point-to-point messaging
>>>>>> module was unable to find any relevant network interfaces:
>>>>>>
>>>>>> Module: OpenFabrics (openib)
>>>>>>    Host: dornroeschen.igpm.rwth-aachen.de
>>>>>> CMA: unable to get RDMA device list
>>>>>
>>>>> It looks like your MPI is either broken or some of the code linked
>>>>> into your application was compiled with a different MPI or a
>>>>> different version. Make sure you can compile and run simple MPI
>>>>> programs in parallel.
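(A minimal MPI test program of the kind Jed suggests would look roughly
like the sketch below; this is an illustration only, not the mpitest
attachment whose output appears further down.)

  #include <mpi.h>
  #include <stdio.h>

  /* Minimal MPI check: build with mpicc/mpic++, run with mpirun -n 2. */
  int main(int argc, char **argv)
  {
    int  rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           name, rank, size);
    MPI_Finalize();
    return 0;
  }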
>>>> Hello Jed,
>>>>
>>>> thank you for your input. Unfortunately, MPI does not seem to be the
>>>> issue here. The attachment contains a simple MPI hello world program
>>>> which runs flawlessly (I will append its output to this mail), and I
>>>> have not encountered any problems with other MPI programs. My
>>>> question still stands.
>>>
>>> This is a simple error. You created the matrix A using
>>> PETSC_COMM_WORLD, but you try to view it using
>>> PETSC_VIEWER_STDOUT_SELF. You need to use PETSC_VIEWER_STDOUT_WORLD in
>>> order to match.
>>>
>>>   Thanks,
>>>
>>>      Matt
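(A minimal sketch of the fix Matthew describes, assuming a Mat A that was
created on PETSC_COMM_WORLD; the viewer's communicator has to match the
communicator of the object being viewed.)

  #include <petscmat.h>

  /* Sketch: view a matrix that lives on PETSC_COMM_WORLD. */
  PetscErrorCode ViewParallelMat(Mat A)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    /* PETSC_VIEWER_STDOUT_SELF is a per-process viewer and does not match
       a PETSC_COMM_WORLD object; the WORLD viewer does.                   */
    ierr = MatView(A, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }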
              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                Greetings,<br>
                Niklas Fischer<br>
                <br>
                mpirun -np 2 ./mpitest<br>
                <br>
                librdmacm: couldn't read ABI version.<br>
                librdmacm: assuming: 4<br>
                CMA: unable to get RDMA device list<br>
--------------------------------------------------------------------------<br>
                [[44086,1],0]: A high-performance Open MPI
                point-to-point messaging module<br>
                was unable to find any relevant network interfaces:<br>
                <br>
                Module: OpenFabrics (openib)<br>
                  Host: <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a><br>
                <br>
                Another transport will be used instead, although this
                may result in<br>
                lower performance.<br>
--------------------------------------------------------------------------<br>
                librdmacm: couldn't read ABI version.<br>
                librdmacm: assuming: 4<br>
                CMA: unable to get RDMA device list<br>
                Hello world from processor <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a>,
                rank 0 out of 2 processors<br>
                Hello world from processor <a href="http://dornroeschen.igpm.rwth-aachen.de" target="_blank">dornroeschen.igpm.rwth-aachen.de</a>,
                rank 1 out of 2 processors<br>
                [<a href="http://dornroeschen.igpm.rwth-aachen.de:128141" target="_blank">dornroeschen.igpm.rwth-aachen.de:128141</a>]
                1 more process has sent help message
                help-mpi-btl-base.txt / btl:no-nics<br>
                [<a href="http://dornroeschen.igpm.rwth-aachen.de:128141" target="_blank">dornroeschen.igpm.rwth-aachen.de:128141</a>]
                Set MCA parameter "orte_base_help_aggregate" to 0 to see
                all help / error messages<br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </blockquote>
      Thank you, Matthew, this solves my viewing problem. Am I doing
      something wrong when initializing the matrices as well? The
      matrix' viewing output starts with "Matrix Object: 1 MPI
      processes" and the Krylov solver does not converge.<br>
      <br>
      Your help is really appreciated,<br>
      Niklas Fischer<br>
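(For comparison, the usual pattern for creating a matrix that is
distributed over PETSC_COMM_WORLD is sketched below; this is only an
illustration of the API, with a hypothetical helper and a square n-by-n
system assumed, not the code from the attached test case.)

  #include <petscmat.h>

  /* Sketch: create an n x n matrix spread across all ranks of
     PETSC_COMM_WORLD, letting PETSc pick the local row ranges. */
  PetscErrorCode CreateDistributedMat(PetscInt n, Mat *A)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCreate(PETSC_COMM_WORLD, A);CHKERRQ(ierr);
    ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
    ierr = MatSetFromOptions(*A);CHKERRQ(ierr);  /* e.g. -mat_type mpiaij */
    ierr = MatSetUp(*A);CHKERRQ(ierr);
    /* ... insert entries with MatSetValues() ... */
    ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }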
      <blockquote type="cite">
        <div dir="ltr">
          <div class="gmail_extra"><br clear="all"><span class="HOEnZb"><font color="#888888">
            <div><br>
            </div>
            -- <br>
            What most experimenters take for granted before they begin
            their experiments is infinitely more interesting than any
            results to which their experiments lead.<br>
            -- Norbert Wiener </font></span></div><span class="HOEnZb"><font color="#888888">
        </font></span></div><span class="HOEnZb"><font color="#888888">
      </font></span></blockquote><span class="HOEnZb"><font color="#888888">
      <br>
    </font></span></blockquote>
    <br>
  </div>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener