<html>
  <head>
    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    No problem. Thanks for the feedback.<br>
    <blockquote
cite="mid:CAM9tzSmby=VRyND2kLy1NLqKMHRAZLSzjWsA+g-FL0fw-XgNfQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">Sorry about the slow reply. Yes, petsc-dev is a
        good place for this. I have a couple comments. First, it's <i>much</i> better
        to split the "organization" part of the patch (which should be
        benign) from the new functionality. Small independent patches
        are so much easier to evaluate than monolithic patches. When in
        doubt, split them apart.
        <div>
          <br>
        </div>
      </div>
    </blockquote>
    Ok.<br>
    <br>
    I looked at the code for VecCUSPGetArrayWrite() in cuspvecimpl.h. I
    can't see why the compiler complains about uninitialized values
    although this only happens, I think, when I compile in double
    complex.<br>
    <blockquote
cite="mid:CAM9tzSmby=VRyND2kLy1NLqKMHRAZLSzjWsA+g-FL0fw-XgNfQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>
          <div>-  CUSPARRAY      *xarray,*yarray;</div>
          <div>+  CUSPARRAY      *xarray,*yarray=PETSC_NULL;</div>
          <div> </div>
          <div style="">This is a very bad method for squashing warnings
            because it prevents valgrind and static analysis from
            checking for uninitialized access. It's much better to go to
            the routine that initializes yarray and have it initialize
            explicitly at the start. This fixes the warnings in all the
            cases I've encountered. (Hopefully this bug, in which the
            compiler sometimes does not figure out that that a return
            value is checked so that a variable is never used
            uninitialized, will eventually be fixed so this problem goes
            away.)</div>
          <div style=""><br>
          </div>
          <div style="">
            <div>+  }</div>
            <div>+  else {</div>
            <div><br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    Ok. Will fix.<br>
    <blockquote
cite="mid:CAM9tzSmby=VRyND2kLy1NLqKMHRAZLSzjWsA+g-FL0fw-XgNfQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div style="">
            <div style="">The PETSc developer's manual says to use '}
              else {' instead.</div>
            <div style=""><br>
            </div>
            <div style="">
              <div>+PetscErrorCode
                MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(Mat A)</div>
              <div>+{</div>
              <div>+  Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors
                 = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;</div>
              <div>+  GPU_Matrix_Ifc* cusparseMatLo  =
                (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;</div>
              <div>+  GPU_Matrix_Ifc* cusparseMatUp  =
                (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;</div>
              <div>+  cusparseStatus_t stat;</div>
              <div>+  PetscFunctionBegin;</div>
              <div>+  stat =
                cusparseMatLo->initializeCusparseMat(MAT_cusparseHandle,</div>
              <div>...</div>
              <div>
                <div>PetscErrorCode MatSolveTranspose_SeqAIJCUSPARSE(Mat
                  A,Vec bb,Vec xx)</div>
                <div>+{</div>
                <div>+  Mat_SeqAIJ     *a = (Mat_SeqAIJ*)A->data;</div>
                <div>+  PetscErrorCode ierr;</div>
                <div>+  CUSPARRAY      *xGPU=PETSC_NULL, *bGPU;</div>
                <div>+  cusparseStatus_t stat;</div>
                <div>+  Mat_SeqAIJCUSPARSETriFactors *cusparseTriFactors
                   = (Mat_SeqAIJCUSPARSETriFactors*)A->spptr;</div>
                <div>+  GPU_Matrix_Ifc *cusparseMatLo  =
                  (GPU_Matrix_Ifc*)cusparseTriFactors->loTriFactorPtr;</div>
                <div>+  GPU_Matrix_Ifc *cusparseMatUp  =
                  (GPU_Matrix_Ifc*)cusparseTriFactors->upTriFactorPtr;</div>
                <div>+  CUSPARRAY * tempGPU = (CUSPARRAY*)
                  cusparseTriFactors->tempvec;</div>
                <div>+</div>
                <div>+  PetscFunctionBegin;</div>
                <div>+  /* Analyze the matrix ... on the fly */</div>
                <div>+  ierr =
                  MatSeqAIJCUSPARSEAnalyzeTransposeForSolve(A);CHKERRQ(ierr);</div>
              </div>
              <div><br>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes this should only be called once, I will add a fix to this. Although
    there is no memory leak as I check this under the hood in
    txpetscgpu.<br>
    <br>
    Is there any way to determine from a MAT object whether a Tranpose
    solve is needed? That is, is there any global information that I can
    access that tells my I'm running BiCG and thus I need a Transpose
    Solve?  If so, then I could do all of this in the Setup functions
    (i.e. the functions that do the ILU factorization and then push the
    data down to the GPU.)<br>
    <blockquote
cite="mid:CAM9tzSmby=VRyND2kLy1NLqKMHRAZLSzjWsA+g-FL0fw-XgNfQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div style="">
            <div style="">
              <div>Is this a memory leak on repeat solves? In any case,
                shouldn't this be done once rather than every time
                MatSolveTranspose() is called? Same story with repeat
                calls to MatSeqAIJCUSPARSEGenerateTransposeForMult?<br>
                <br>
                <div>+  ierr =
                  VecCUSPGetArrayRead(xin,&xarray);CHKERRQ(ierr);</div>
                <div>+#if defined(PETSC_USE_COMPLEX)</div>
                <div>+  thrust::transform(xarray->begin(),
                  xarray->end(), xarray->begin(), conjugate());</div>
                <div>+#endif</div>
                <div>+  ierr =
                  VecCUSPRestoreArrayRead(xin,&xarray);CHKERRQ(ierr);</div>
              </div>
              <div style=""><br>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Oops. Will fix.<br>
    <blockquote
cite="mid:CAM9tzSmby=VRyND2kLy1NLqKMHRAZLSzjWsA+g-FL0fw-XgNfQ@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div style="">
            <div style="">
              <div style="">WAT? The <b>Read</b> accessor returns a
                non-const pointer and you're using it to modify the
                data? Oh the humanity.</div>
            </div>
          </div>
          <div><br>
          </div>
          <div><br>
          </div>
          <div>On Tue, Jan 1, 2013 at 2:20 PM, Paul Mullowney <span
              dir="ltr"><<a moz-do-not-send="true"
                href="mailto:paulm@txcorp.com" target="_blank">paulm@txcorp.com</a>></span>
            wrote:<br>
          </div>
          <div class="gmail_extra">
            <div class="gmail_quote">
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi,<br>
                <br>
                Here's a patch for running BiCG on GPUs (with ILU(0))
                preconditioners on GPUs for the <a
                  moz-do-not-send="true" href="http://aijcusparse.cu"
                  target="_blank">aijcusparse.cu</a> class. I needed to
                add<br>
                (1) a VecConjugate implementation in <a
                  moz-do-not-send="true" href="http://veccusp.cu"
                  target="_blank">veccusp.cu</a><br>
                (2) various methods in <a moz-do-not-send="true"
                  href="http://aijcusparse.cu" target="_blank">aijcusparse.cu</a>
                for building the transpose matrices for
                MatSolveTranspose* methods. The implementation of the
                solves is done under the hood in the txpetscgpu library.<br>
                <br>
                Everything builds and runs fine on my end. Let me know
                if this ok to push.<br>
                <br>
                Also, do patches like this go to petsc-maint or
                petsc-dev?<br>
                <br>
                Thanks,<br>
                -Paul<br>
                <br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>