Matrix input file format

Randall Mackie randy at geosystem.us
Thu Jun 15 09:03:58 CDT 2006


Hi Evrim,

The call to VecGetOwnershipRange below determines, for each
processor, which part of the global matrix it "owns". Say you have a
10x10 matrix and 2 processors: the first 5 rows go to the first
processor (rank 0, Istart=0, Iend=5) and the last 5 rows to the
second (rank 1, Istart=5, Iend=10). Note that Iend is one past the
last owned row. The subroutine pre_op3d.F is below, but I would
suggest that you build it up incrementally.

Just write a little program that has the first four Vec calls
below, and then print out the results, like,

	print*,rank, Istart, Iend

so you get an understanding of what's going on. Then you can add
the next part about figuring out the pre-allocation, then print
out those values, etc.
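Something like this (a minimal sketch, assuming the same PETSc 2.3.x
Fortran includes and calling sequences used in the code below):

      program ownership
      implicit none
#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"

      Vec            b,xsol
      PetscInt       np,mloc,Istart,Iend
      integer        rank
      PetscErrorCode ierr

      call PetscInitialize(PETSC_NULL_CHARACTER,ierr)
      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)

! small global size, so the ownership split is easy to check by hand
      np = 10

      call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,np,b,ierr)
      call VecDuplicate(b,xsol,ierr)
      call VecGetLocalSize(b,mloc,ierr)
      call VecGetOwnershipRange(b,Istart,Iend,ierr)

! each rank reports which of the np global rows it owns
      print*,rank,mloc,Istart,Iend

      call VecDestroy(xsol,ierr)
      call VecDestroy(b,ierr)
      call PetscFinalize(ierr)
      end

Run it on 2 processors and each rank will print its own Istart and
Iend.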

Here's a snippet of my subroutine to figure out the preallocation. The variables
ijkhx, ijkhy, and ijkhz are simply the global row numbers for
each entry.



        subroutine pre_op3d(l,m,n,Istart,Iend,ijkhx,ijkhy,ijkhz,
      .                     d_nnz,o_nnz)

************************************************************************
* subroutine to determine the number of on-processor and off-processor
* elements in the A coefficient matrix. This greatly increases the
* efficiency of building the parallel matrix.
************************************************************************

       USE KINDS
       IMPLICIT NONE

! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!                    Include files
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

#include "include/finclude/petsc.h"
#include "include/finclude/petscvec.h"
#include "include/finclude/petscsys.h"

! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
!                   Variable declarations
! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

       PetscInt             Istart,Iend
       PetscInt             i,j,k,jj,ic
       PetscInt             col(13),iloc,kk
       PetscInt             d_nnz(*),o_nnz(*)
       PetscErrorCode       ierr


       INTEGER (KIND=i4) :: l,m,n
       INTEGER (KIND=i4) :: ijkhx(l,m,n),ijkhy(l,m,n),ijkhz(l,m,n)


       jj=0            ! global (1-based) row counter
       iloc=0          ! local row counter, indexes d_nnz/o_nnz

       do i=1,l
         do k=2,n
           do j=2,m

             jj=jj+1

             IF (jj >= Istart+1 .and. jj <= Iend) THEN

               iloc=iloc+1
               d_nnz(iloc)=0
               o_nnz(iloc)=0

* Hxijk
               ic=1
               col(ic)=jj-1

               if(k.ne.2) then
                 ic=ic+1
                 col(ic)=ijkhx(i,j,k-1)-1       ! Hxijc
               end if

               if(i.ne.1) then
                 ic=ic+1
                 col(ic)=ijkhz(i,j,k-1)-1       ! Hzijc
               end if

               if(i.ne.l) then
                 ic=ic+1
                 col(ic)=ijkhz(i+1,j,k-1)-1     ! Hzdjc
               end if

	[more entries like these...]

               do kk=1,ic

! columns are 0-based, so the diagonal block is Istart <= col < Iend
                 IF (col(kk) >= Istart .and. col(kk) < Iend) THEN
                   d_nnz(iloc)=d_nnz(iloc)+1
                 ELSE
                   o_nnz(iloc)=o_nnz(iloc)+1
                 END IF

               end do

             END IF

           end do
         end do
       end do

       return
       end


And now you can use d_nnz and o_nnz to pre-allocate your matrix.
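For example, with mloc and np as in the calls quoted below (this is
the same MatCreateMPIAIJ call that appears in my earlier message):

        call MatCreateMPIAIJ(PETSC_COMM_WORLD,mloc,mloc,np,np,
      .      PETSC_NULL_INTEGER,d_nnz,PETSC_NULL_INTEGER,o_nnz,
      .      A,ierr)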

Randy


Evrim Dizemen wrote:
> Dear Randall,
> 
> I guess I am beginning to understand the concepts, but there is
> still a missing point: I do not know when and how we define Istart
> and Iend. I'd be glad if you could send me the pre_op3d routine so I
> can see the algorithm, which is a black box for me now.
> 
> Thanks a lot
> 
> EVRIM
> 
> 
> Randall Mackie wrote:
>> Hi Evrim,
>>
>> It's quite easy to modify your Fortran code to do what you want. I 
>> thought
>> I had written it all out before, but I'll try again. There are many ways
>> to do this, but I'll start with the easiest, at least if you're going to
>> just modify your current sequential code.
>>
>> Let's say that your matrix has np global rows. Then
>>
>>       call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,np,b,ierr)
>>       call VecDuplicate(b,xsol,ierr)
>>       call VecGetLocalSize(b,mloc,ierr)
>>       call VecGetOwnershipRange(b,Istart,Iend,ierr)
>>
>>       do i=Istart+1,Iend
>>         loc(i)=i-1
>>       end do
>>
>> These statements create parallel vectors for the solution (xsol) and
>> the right-hand side (b). The array loc(i) is used to set values in
>> the vectors later.
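>>
>> For example, setting the locally owned part of the right-hand side
>> might look like this (a sketch; rhs stands for whatever array holds
>> the values for your owned rows):
>>
>>       call VecSetValues(b,mloc,loc(Istart+1),rhs,INSERT_VALUES,ierr)
>>       call VecAssemblyBegin(b,ierr)
>>       call VecAssemblyEnd(b,ierr)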
>>
>> Then
>>
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> !         Create the linear solver context
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>
>>       call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
>>
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> !         Create the scatter context for getting results back to
>> !         each node.
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>
>>       call VecScatterCreateToAll(xsol,xToLocalAll,xseq,ierr)
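>>
>> (After the solve, this scatter is used to gather the full solution
>> onto every process -- a sketch, using the PETSc 2.3.x calling
>> sequence:)
>>
>>       call VecScatterBegin(xsol,xseq,INSERT_VALUES,SCATTER_FORWARD,
>>      .     xToLocalAll,ierr)
>>       call VecScatterEnd(xsol,xseq,INSERT_VALUES,SCATTER_FORWARD,
>>      .     xToLocalAll,ierr)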
>>
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> !     Create the matrix that defines the linear system, Ax = b,
>> !     for the EM problem.
>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>
>>       call pre_op3d(l,m,nzn,Istart,Iend,ijkhx,ijkhy,ijkhz,d_nnz,o_nnz)
>>
>>       call MatCreateMPIAIJ(PETSC_COMM_WORLD,mloc,mloc,np,np,
>>      .     PETSC_NULL_INTEGER, d_nnz, PETSC_NULL_INTEGER,
>>      .     o_nnz,A,ierr)
>>
>>       call set_op3d(A,l,m,nzn,period,resist,x,y,z,Istart,Iend,
>>      .           ijkhx,ijkhy,ijkhz)
>>
>>
>> The subroutine pre_op3d is an important routine: it figures out
>> how to pre-allocate space for your parallel matrix. This is the
>> difference between near-instantaneous assembly and the 2.5 hours
>> that you experienced. Basically, it just computes the global column
>> numbers and figures out whether they fall in the owned range
>> [Istart, Iend). I can send you my subroutine if you'd like.
>>
>> The subroutine set_op3d.F actually assembles the parallel matrix and
>> goes like this:
>>
>>
>>       jj=0
>>
>>       do i=1,l
>>         do k=2,n
>>           do j=2,m
>>
>>             jj=jj+1
>>             row = jj-1
>>
>>             IF (jj >= Istart+1 .and. jj <= Iend) THEN
>>
>>               ! ... compute ic, col(:), and v(:) for this row ...
>>
>>               call MatSetValues(A,i1,row,ic,col,v,INSERT_VALUES,
>>      .             ierr)
>>
>>             END IF
>>
>>           end do
>>         end do
>>       end do
>>
>> At the end,
>>
>>       call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>>       call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>>
>>
>> Again, because you compute the pre-allocation, this is
>> near-instantaneous, even for large models (larger than you're
>> using).
>>
>>
>> Once you do that, you're golden:
>>
>>         call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr)
>>
>> etc.
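>>
>> For completeness, the solve itself is then something like (a
>> sketch):
>>
>>       call KSPSetFromOptions(ksp,ierr)
>>       call KSPSolve(ksp,b,xsol,ierr)
>>
>> after which the scatter created above brings xsol back to every
>> process.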
>>
>> Randy M.
>> San Francisco
>>
>>
>> Evrim Dizemen wrote:
>>> Hi all,
>>>
>>> Again, thanks for your comments. I guess I did not define the
>>> problem correctly. I have a sequential Fortran code that gives me
>>> the global matrix, 200000x200000. The code writes the matrix to a
>>> binary file in little-endian mode (I can write only the nonzero
>>> terms or the entire matrix). I tried to change the binary mode to
>>> big-endian and read the global matrix with a C program, as in the
>>> example src/mat/examples/tests/ex31.c. However, the program reads
>>> the binary file wrong and gives the following error message, even
>>> though the true number of nonzeros in the binary file is 6 (for the
>>> 3x3 test case):
>>>
>>> reading matrix in binary from matrix.dat ...
>>> ------------------------------------------------------------------------
>>> Petsc Release Version 2.3.1, Patch 13, Wed May 10 11:08:35 CDT 2006
>>> BK revision: balay at asterix.mcs.anl.gov|ChangeSet|20060510160640|13832
>>> See docs/changes/index.html for recent updates.
>>> See docs/faq.html for hints about trouble shooting.
>>> See docs/index.html for manual pages.
>>> ------------------------------------------------------------------------
>>> ./ex31 on a linux named akbaba.ae.metu.edu.tr by evrim Thu Jun 15 
>>> 09:26:36 2006
>>> Libraries linked from /home/evrim/petsc-2.3.1-p13/lib/linux
>>> Configure run at Tue May 30 10:26:48 2006
>>> Configure options --with-scalar-type=complex --with-shared=0
>>> ------------------------------------------------------------------------
>>> [0]PETSC ERROR: MatLoad_SeqAIJ() line 3055 in 
>>> src/mat/impls/aij/seq/aij.c
>>> [0]PETSC ERROR: Read from file failed!
>>> [0]PETSC ERROR: Inconsistant matrix data in file. no-nonzeros = 
>>> 100663296, sum-row-lengths = 234300
>>> !
>>> [0]PETSC ERROR: MatLoad() line 149 in src/mat/utils/matio.c
>>> [0]PETSC ERROR: main() line 37 in src/mat/examples/tests/ex31.c
>>>
>>> I want to send the global matrix to PETSc at once, either through a
>>> written input file (as I'm working on now) or by passing the matrix
>>> array from my Fortran code, and then partition it and solve
>>> iteratively. After the solution I also want to get the solution
>>> vector back into my Fortran code. As I said in previous mails, I
>>> tried to send the matrix array to PETSc using MatSetValues, reading
>>> one value at a time in a do loop, but it took about 2.5 hours to
>>> read the global matrix. I also tried to read a row at a time, but I
>>> cannot figure out an algorithm for this. Hence I would prefer not
>>> to create the matrix again in PETSc with MatSetValues.
>>>
>>> Aside from that, I figured out that binary files written by Fortran
>>> and C are completely different from each other (Fortran adds the
>>> record length to the beginning and end of each record), so I wrote
>>> a C interface routine to take the matrix array from the Fortran
>>> code and write it to a binary file in C format. With this code I
>>> avoided the additional information in the binary file, but I still
>>> have the endianness problem.
>>>
>>> I know I am asking a lot, but since I'm a rookie at parallel
>>> programming, the C language, and using libraries, I really need
>>> your comments on my problem. Sorry for this long mail, and thanks a
>>> lot for your kind effort in guiding me.
>>>
>>> Thanks
>>>
>>> EVRIM
>>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034



