[petsc-users] Fwd: Building the same petsc matrix with different numprocs gives different results!

Matthew Knepley knepley at gmail.com
Tue Sep 24 11:03:13 CDT 2013


On Tue, Sep 24, 2013 at 8:35 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:

>
>
>
> On Tue, Sep 24, 2013 at 8:41 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Tue, Sep 24, 2013 at 8:08 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:
>>
>>>
>>> On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>>
>>>> Analabha Roy <hariseldon99 at gmail.com> writes:
>>>>
>>>> > Hi all,
>>>> >
>>>> >
>>>> > Compiling and running this code
>>>> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>> > that builds a PETSc matrix gives different results when run with
>>>> > different numbers of processors.
>>>>
>>>>
>>> Thanks for the reply.
>>>
>>>
>>>> Uh, if you call rand() on different processors, why would you expect it
>>>> to give the same results?
>>>>
>>> Right, I get that. The rand() was a placeholder.
>>>
>>> The original, much larger code
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>> has the same loop structure and calls the same PETSc routines, but
>>> running it with
>>>
>>> mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0 -draw_out -draw_pause -1
>>>
>>> with N=1,2,3,4 gives different results for the matrix dumped out by
>>> lines 514-519
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
>>> The matrix itself is evaluated in parallel. It is created in lines 263-275
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
>>> and evaluated in lines 294-356
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>.
>>>
>>> (you can click on the line numbers above to navigate directly to them)
>>>
>>> Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the output of
>>> lines 514-519
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>> for N=1,2,3,4 procs, left to right.
>>>
>>> They're different for different numbers of procs. They should be the same,
>>> since none of my input parameters depend on numprocs, and I don't
>>> explicitly use the size or rank anywhere in the code.
>>>
>>
>> You are likely not dividing the rows you loop over so you are redundantly
>> computing.
>>
>
> Thanks for the reply.
>
> Line 274
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#274>
> gets the local row indices of the PETSc matrix AVG_BDIBJ.
>
> Line 295
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#295>
> iterates over the local rows, and the lines below it get the column
> elements. For each row, the column elements are assigned by the lines up
> to Line 344
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#344>
> and stored locally in colvalues[]. Dunno if the details are relevant.
>
> Line 347
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>
> inserts the sitestride1^th row into the matrix.
>
> Line 353+
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#353>
> does the matrix assembly.
>
> Then, after a lot of currently irrelevant code,
>
> Line 514+
> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
> dumps the matrix plot to graphics.
>
>
> Different numprocs give different matrices.
>
> Can somebody suggest what  I did wrong (or didn't do)?
>

Different values are being given to MatSetValues() for different numbers of
processes. So

  1) Reduce this to the smallest problem size possible

  2) Print out all rows/cols/values for each call

  3) Compare 2 procs to the serial case
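
One possible way to carry out steps 2 and 3 is sketched below in plain C.
format_row() is a hypothetical helper, not part of PETSc or of eth.c. It
writes one "row col value" line per entry of a row into a buffer, which
could then be printed or written to a per-rank file just before the row is
handed to MatSetValues(). It assumes the indices are plain int and the
values are real double; a build with 64-bit indices or complex scalars
would need different types and format strings.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical debugging helper (an assumption, not a PETSc routine).
 * Formats one matrix row as lines of "row col value".
 * Returns the number of characters written, or -1 if buf is too small. */
int format_row(char *buf, size_t buflen, int row, int ncols,
               const int *cols, const double *vals)
{
    size_t used = 0;

    if (buflen > 0)
        buf[0] = '\0';              /* empty string when ncols == 0 */

    for (int j = 0; j < ncols; j++) {
        /* %.17g prints enough digits to distinguish any two doubles */
        int n = snprintf(buf + used, buflen - used, "%d %d %.17g\n",
                         row, cols[j], vals[j]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;              /* this entry would not fit */
        used += (size_t)n;
    }
    return (int)used;
}
```

In the loop quoted further down, the call would look something like
format_row(buf, sizeof buf, sitestride1, matsize, idx, colvalues). Lines
from different processes arrive in no particular order, so sorting the
output of the serial run and of the 2-process run before diffing them
should show the first (row, col) entry where the values disagree.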

    Matt


>
>>    Matt
>>
>>
>>>
>>>
>>>> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
>>>>     {
>>>>       for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>>>>         {
>>>>           for (alpha = 0; alpha < dim; alpha++)
>>>>             {
>>>>               for (mu = 0; mu < dim; mu++)
>>>>                 for (lambda = 0; lambda < dim; lambda++)
>>>>                   {
>>>>                     vecval = rand () / rand ();
>>>>                   }
>>>>
>>>>               VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>>>>
>>>>             }
>>>>           VecAssemblyBegin (BDB_AA);
>>>>           VecAssemblyEnd (BDB_AA);
>>>>           VecSum (BDB_AA, &element);
>>>>           colvalues[sitestride2] = element;
>>>>
>>>>         }
>>>>       //Insert the array of colvalues to the sitestride1^th row of H
>>>>       MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>>>>                     INSERT_VALUES);
>>>>
>>>>     }
>>>>
>>>> > The code is large and complex, so I have created a smaller program
>>>> > with the same
>>>> > loop structure here. <http://pastebin.ca/2457643>
>>>> >
>>>> > Compiling it and running it with "mpirun -np $N ./test -draw_pause -1"
>>>> > gives different results for different values of N even though it's
>>>> > not supposed to.
>>>>
>>>> What do you expect to see?
>>>>
>>>> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg> for
>>>> > N=1,2,3,4 from left to right.
>>>> >
>>>> > Can anyone guide me as to what I'm doing wrong? Are any of the PETSc
>>>> > routines used not parallelizable?
>>>> >
>>>> > Thanks in advance,
>>>> >
>>>> > Regards.
>>>> >
>>>> > --
>>>> > ---
>>>> > *Analabha Roy*
>>>> > C.S.I.R <http://www.csir.res.in>  Senior Research
>>>> > Associate<http://csirhrdg.res.in/poolsra.htm>
>>>> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
>>>> > Section 1, Block AF
>>>> > Bidhannagar, Calcutta 700064
>>>> > India
>>>> > *Emails*: daneel at physics.utexas.edu, hariseldon99 at gmail.com
>>>> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>>>>
>>>
>>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>
>


