[petsc-users] Fwd: Building the same petsc matrix with different numprocs gives different results!
Matthew Knepley
knepley at gmail.com
Tue Sep 24 13:05:04 CDT 2013
On Tue, Sep 24, 2013 at 10:58 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:
> Hi,
>
> Sorry for the misunderstanding.
>
> I modified my source <http://pastebin.ca/2457850> so that the
> rows/cols/values for each call are printed before being passed to
> MatSetValues().
>
> Then I ran it with 1 and 2 processors.
>
> Here are the outputs: <http://pastebin.ca/2457852>
>
> Strange! Running it with 2 procs, only half the values show up!
>
PetscPrintf() prints only from the first process in the communicator (rank 0
for PETSC_COMM_WORLD). Use PETSC_COMM_SELF to print from every rank.
Matt
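
To illustrate the point about per-rank printing, here is a minimal sketch, assuming the goal is to log the data of every MatSetValues() call from every rank. The helper name format_row_entry is hypothetical and is not part of PETSc; it only builds a rank-tagged line in plain C.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not a PETSc routine): format the data of one
 * single-row MatSetValues() call into buf, tagged with the calling rank.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if the buffer is too small. */
int format_row_entry(char *buf, size_t buflen, int rank, int row,
                     int ncols, const int *cols, const double *vals)
{
    int used = snprintf(buf, buflen, "[rank %d] row %d:", rank, row);
    if (used < 0 || (size_t)used >= buflen)
        return -1;

    for (int j = 0; j < ncols; j++) {
        size_t left = buflen - (size_t)used;
        /* Append one "(column, value)" pair per entry. */
        int n = snprintf(buf + used, left, " (%d, %g)", cols[j], vals[j]);
        if (n < 0 || (size_t)n >= left)
            return -1;
        used += n;
    }
    return used;
}
```

The resulting string could then be handed to PetscPrintf(PETSC_COMM_SELF, "%s\n", buf) so that each rank prints its own rows, or to PetscSynchronizedPrintf() on PETSC_COMM_WORLD followed by PetscSynchronizedFlush() if rank-ordered output is preferred.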
> And even those do not match!!!!
>
>
>
>
> On Tue, Sep 24, 2013 at 11:12 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Tue, Sep 24, 2013 at 10:39 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> On Tue, Sep 24, 2013 at 9:33 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>
>>>> On Tue, Sep 24, 2013 at 8:35 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Sep 24, 2013 at 8:41 PM, Matthew Knepley <knepley at gmail.com> wrote:
>>>>>
>>>>>> On Tue, Sep 24, 2013 at 8:08 AM, Analabha Roy <hariseldon99 at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>> Analabha Roy <hariseldon99 at gmail.com> writes:
>>>>>>>>
>>>>>>>> > Hi all,
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Compiling and running this code
>>>>>>>> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>>>>>> > that builds a PETSc matrix gives different results when run with a
>>>>>>>> > different number of processors.
>>>>>>>>
>>>>>>>>
>>>>>>> Thanks for the reply.
>>>>>>>
>>>>>>>
>>>>>>>> Uh, if you call rand() on different processors, why would you expect
>>>>>>>> it to give the same results?
>>>>>>>>
>>>>>>> Right, I get that. The rand() was a placeholder.
>>>>>>>
>>>>>>> This original, much larger code
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
>>>>>>> replicates the same loop structure and calls the same PETSc routines, but
>>>>>>> running it as
>>>>>>>
>>>>>>> mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0
>>>>>>> -draw_out -draw_pause -1
>>>>>>>
>>>>>>> with N=1,2,3,4 gives different results for the matrix dumped out by
>>>>>>> lines 514-519
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
>>>>>>> The matrix itself is evaluated in parallel: it is created in lines 263-275
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
>>>>>>> and filled in lines 294-356
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>.
>>>>>>>
>>>>>>> (you can click on the line numbers above to navigate directly to
>>>>>>> them)
>>>>>>>
>>>>>>> Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the output
>>>>>>> of lines 514-519
>>>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>>>>>> for N=1,2,3,4 procs, left to right.
>>>>>>>
>>>>>>> They're different for different numbers of procs. They should be the
>>>>>>> same, since none of my input parameters depend on numprocs, and I don't
>>>>>>> explicitly use the size or rank anywhere in the code.
>>>>>>>
>>>>>>
>>>>>> You are likely not dividing the rows you loop over so you are
>>>>>> redundantly computing.
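
As an illustration of what dividing the rows means, here is a minimal sketch of a contiguous block split of N global rows over the ranks. It is assumed to resemble the default layout PETSc picks when local sizes are left to PETSC_DECIDE; the function name block_ownership_range is hypothetical, and in a real PETSc code the range reported by MatGetOwnershipRange() is the one to trust.

```c
/* Hypothetical helper (not a PETSc routine): compute the half-open row
 * range [start, end) owned by `rank` when N rows are split into
 * contiguous blocks over `size` ranks. The first (N % size) ranks each
 * own one extra row. */
void block_ownership_range(int N, int size, int rank, int *start, int *end)
{
    int base  = N / size;   /* rows every rank gets at minimum */
    int extra = N % size;   /* leftover rows, given to the lowest ranks */

    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

For a 9-row matrix this gives rows [0, 5) and [5, 9) on 2 ranks, and 3, 2, 2, 2 rows on 4 ranks. Each rank should loop only over its own range, so that every row is computed exactly once.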
>>>>>>
>>>>>
>>>>> Thanks for the reply.
>>>>>
>>>>> Line 274
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#274>
>>>>> gets the local row indices of the PETSc matrix AVG_BDIBJ.
>>>>>
>>>>> Line 295
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#295>
>>>>> iterates over the local rows, and the lines below it get the column
>>>>> elements. For each row, the column elements are assigned by the lines
>>>>> up to Line 344
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#344>
>>>>> and stored locally in colvalues[]. Dunno if the details are relevant.
>>>>>
>>>>> Line 347
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#347>
>>>>> inserts the sitestride1^th row into the matrix.
>>>>>
>>>>> Line 353+
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#353>
>>>>> does the matrix assembly.
>>>>>
>>>>> Then, after a lot of currently irrelevant code,
>>>>>
>>>>> Line 514+
>>>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
>>>>> dumps the matrix plot to graphics.
>>>>>
>>>>>
>>>>> Different numprocs give different matrices.
>>>>>
>>>>> Can somebody suggest what I did wrong (or didn't do)?
>>>>>
>>>>
>>>> Different values are being given to MatSetValues() for different
>>>> numbers of processes. So
>>>>
>>>> 1) Reduce this to the smallest problem size possible
>>>>
>>>> 2) Print out all rows/cols/values for each call
>>>>
>>>> 3) Compare 2 procs to the serial case
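
Step 3 above can be sketched as a small comparison routine, assuming the values logged from the serial run and from the parallel run have been collected into two arrays in the same global (row, column) order. The name first_mismatch and the tolerance parameter are hypothetical choices for this sketch.

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the first entry where the
 * serial and parallel values differ by more than tol, or -1 if all n
 * entries agree within tol. */
long first_mismatch(const double *serial, const double *parallel,
                    size_t n, double tol)
{
    for (size_t i = 0; i < n; i++) {
        double diff = serial[i] - parallel[i];
        if (diff < 0.0)
            diff = -diff;           /* absolute difference */
        if (diff > tol)
            return (long)i;
    }
    return -1;
}
```

The first mismatching index points at the earliest MatSetValues() input that depends on the number of processes, which is usually a better place to start looking than the assembled matrix.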
>>>>
>>>>
>>>
>>> Thanks for your excellent suggestion.
>>>
>>> I modified my code
>>> <https://code.google.com/p/daneelrepo/source/diff?spec=svn1435&r=1435&format=side&path=/eth_question/eth.c>
>>> to dump the matrix in binary.
>>>
>>> Then I used this Python script I had
>>> <https://code.google.com/p/daneelrepo/source/browse/eth_question/mat_bin2ascii.py>
>>> to convert it to ASCII.
>>>
>>
>> Do not print the matrix, print the data you are passing to MatSetValues().
>>
>> MatSetValues() is not likely to be broken. Every PETSc code in the world
>> calls this many times on every simulation.
>>
>> Matt
>>
>>
>>>
>>> Here are the values of AVG_BDIBJ <http://pastebin.ca/2457842>,
>>> a 9x9 matrix (the smallest possible problem size), run with the exact same
>>> input parameters on 1, 2, 3 and 4 procs.
>>>
>>> As you can see, the 1 and 2 proc results match up, but the 3 and 4 proc
>>> results do not.
>>>
>>> Serious weirdness.
>>>
>>>
>>>
>>>> Matt
>>>>
>>>>
>>>>>
>>>>>> Matt
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
>>>>>>>> {
>>>>>>>>     for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>>>>>>>>     {
>>>>>>>>         for (alpha = 0; alpha < dim; alpha++)
>>>>>>>>         {
>>>>>>>>             for (mu = 0; mu < dim; mu++)
>>>>>>>>                 for (lambda = 0; lambda < dim; lambda++)
>>>>>>>>                 {
>>>>>>>>                     vecval = rand () / rand ();
>>>>>>>>                 }
>>>>>>>>
>>>>>>>>             VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>>>>>>>>         }
>>>>>>>>         VecAssemblyBegin (BDB_AA);
>>>>>>>>         VecAssemblyEnd (BDB_AA);
>>>>>>>>         VecSum (BDB_AA, &element);
>>>>>>>>         colvalues[sitestride2] = element;
>>>>>>>>     }
>>>>>>>>     //Insert the array of colvalues to the sitestride1^th row of H
>>>>>>>>     MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>>>>>>>>                   INSERT_VALUES);
>>>>>>>> }
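
The rand() call in the loop above makes each value depend on how many times the generator has been called on that process, so the result changes with the row distribution. (In C, rand () / rand () is also an integer division, which truncates and can divide by zero.) One possible placeholder that is independent of the number of processes is to derive each value from the global indices alone. The sketch below assumes that approach; the function name entry_value and the choice of a splitmix64-style mixing step are illustrative and do not come from the original code.

```c
#include <stdint.h>

/* Hypothetical placeholder generator: map a global (row, col) pair to a
 * reproducible value in [0, 1). The result depends only on the indices,
 * never on which rank calls it or in what order. */
double entry_value(uint64_t row, uint64_t col)
{
    /* Combine the two indices, then scramble with a splitmix64-style mix. */
    uint64_t z = row * 0x9E3779B97F4A7C15ULL + col + 1;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;

    /* Keep the top 53 bits and scale by 2^-53 to land in [0, 1). */
    return (double)(z >> 11) / 9007199254740992.0;
}
```

With a generator like this, colvalues[sitestride2] could be set from entry_value(sitestride1, sitestride2), and the assembled matrix would then be expected to be identical for any number of processes.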
>>>>>>>>
>>>>>>>> > The code is large and complex, so I have created a smaller program
>>>>>>>> > with the same loop structure here: <http://pastebin.ca/2457643>
>>>>>>>> >
>>>>>>>> > Compiling it and running it with "mpirun -np $N ./test -draw_pause -1"
>>>>>>>> > gives different results for different values of N, even though it's
>>>>>>>> > not supposed to.
>>>>>>>>
>>>>>>>> What do you expect to see?
>>>>>>>>
>>>>>>>> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg> for
>>>>>>>> > N=1,2,3,4 from left to right.
>>>>>>>> >
>>>>>>>> > Can anyone guide me as to what I'm doing wrong? Are any of the PETSc
>>>>>>>> > routines used not parallelizable?
>>>>>>>> >
>>>>>>>> > Thanks in advance,
>>>>>>>> >
>>>>>>>> > Regards.
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > ---
>>>>>>>> > *Analabha Roy*
>>>>>>>> > C.S.I.R <http://www.csir.res.in> Senior Research
>>>>>>>> > Associate<http://csirhrdg.res.in/poolsra.htm>
>>>>>>>> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
>>>>>>>> > Section 1, Block AF
>>>>>>>> > Bidhannagar, Calcutta 700064
>>>>>>>> > India
>>>>>>>> > *Emails*: daneel at physics.utexas.edu, hariseldon99 at gmail.com
>>>>>>>> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> What most experimenters take for granted before they begin their
>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>> experiments lead.
>>>>>> -- Norbert Wiener
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener