[petsc-users] Fwd: Building the same petsc matrix with different numprocs gives different results!
Analabha Roy
hariseldon99 at gmail.com
Tue Sep 24 10:08:16 CDT 2013
On Tue, Sep 24, 2013 at 1:42 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> Analabha Roy <hariseldon99 at gmail.com> writes:
>
> > Hi all,
> >
> >
> > Compiling and running this code
> > <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>,
> > which builds a PETSc matrix, gives different results when run with
> > different numbers of processors.
>
>
Thanks for the reply.

> Uh, if you call rand() on different processors, why would you expect it
> to give the same results?

Right, I get that. The rand() was a placeholder.
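
To make the rand() point concrete: every MPI process runs its own copy of the C library generator, so unless the ranks are seeded differently each one replays the same sequence, and the value a row receives depends on which rank owns it and how many calls that rank has already made. A minimal sketch of one way to get rank-independent test values is to derive each entry from its global (row, col) index rather than from call order. The names entry_value and fill_rows are hypothetical and are not part of eth.c or PETSc; the mixing constants are borrowed from the SplitMix64 generator, and the way row and col are combined is an arbitrary choice made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from eth.c or PETSc): map a global (row, col)
   pair to a reproducible value in [0, 1). The result depends only on the
   indices, never on which process calls it or in what order. */
static double entry_value(uint64_t row, uint64_t col)
{
    /* Combine the indices (arbitrary choice for this sketch), then apply
       the SplitMix64 mixing steps. */
    uint64_t z = row * 0x100000001B3ULL + col + 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Fill rows [row_start, row_end) of a dense n x n row-major array, the way
   one rank would fill only the rows it owns. */
static void fill_rows(double *a, int n, int row_start, int row_end)
{
    for (int i = row_start; i < row_end; i++)
        for (int j = 0; j < n; j++)
            a[(size_t)i * n + j] = entry_value((uint64_t)i, (uint64_t)j);
}
```

Filling all rows in one call, or in several calls that each cover a block of rows, produces the same array, which is the property the rand()-based placeholder lacks.
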
This original, much larger code
<https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c>
replicates the same loop structure and runs the same PETSc subroutines, but
running it with

mpirun -np $N ./eth -lattice_size 5 -vector_size 1 -repulsion 0.0 -draw_out -draw_pause -1

with N=1,2,3,4 gives different results for the matrix dumped out by lines
514-519 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>.
The matrix itself is evaluated in parallel: it is created in lines 263-275
<https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#263>
and evaluated in lines 294-356
<https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#294>
(you can click on the line numbers above to navigate directly to them).

Here is a sample <http://i43.tinypic.com/zyhf2f.jpg> of the output of
lines 514-519 <https://code.google.com/p/daneelrepo/source/browse/eth_question/eth.c#514>
for N=1,2,3,4 procs, left to right.
They're different for different numbers of procs. They should be the same,
since none of my input parameters depend on the number of processes, and I
don't explicitly use the size or rank anywhere in the code.
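
One caveat on not using the size or rank explicitly: the loop bounds Istart and Iend below are presumably obtained from MatGetOwnershipRange (an assumption, since that call is not shown here), and those bounds do change with the number of processes. A minimal sketch of the row split, assuming local sizes are left to PETSC_DECIDE and follow the rule documented for PetscSplitOwnership (every rank gets N/size rows, and the first N % size ranks get one extra). The helper ownership_range is hypothetical and is not a PETSc function.

```c
/* Hypothetical helper (not a PETSc call): compute the half-open row range
   [*start, *end) owned by `rank` when n_global rows are divided among
   `size` ranks, with the first (n_global % size) ranks taking one extra. */
static void ownership_range(int n_global, int size, int rank,
                            int *start, int *end)
{
    int base  = n_global / size;   /* rows every rank receives */
    int extra = n_global % size;   /* leftover rows, one each to low ranks */

    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

For example, 10 rows on 3 ranks split as [0,4), [4,7), [7,10), while on 4 ranks they split as [0,3), [3,6), [6,8), [8,10). Any per-rank state carried through the row loop (such as a random number stream) therefore meets different rows for different N.
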
> for (sitestride1 = Istart; sitestride1 < Iend; sitestride1++)
> {
>     for (sitestride2 = 0; sitestride2 < matsize; sitestride2++)
>     {
>         for (alpha = 0; alpha < dim; alpha++)
>         {
>             for (mu = 0; mu < dim; mu++)
>                 for (lambda = 0; lambda < dim; lambda++)
>                 {
>                     vecval = rand () / rand ();
>                 }
>
>             VecSetValue (BDB_AA, alpha, vecval, INSERT_VALUES);
>
>         }
>         VecAssemblyBegin (BDB_AA);
>         VecAssemblyEnd (BDB_AA);
>         VecSum (BDB_AA, &element);
>         colvalues[sitestride2] = element;
>
>     }
>     //Insert the array of colvalues to the sitestride1^th row of H
>     MatSetValues (AVG_BDIBJ, 1, &sitestride1, matsize, idx, colvalues,
>                   INSERT_VALUES);
>
> }
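
A note on the loop above: VecAssemblyBegin, VecAssemblyEnd and VecSum are collective over the communicator the vector lives on. If BDB_AA was created on PETSC_COMM_WORLD (an assumption, since its creation is not shown here), then every rank inserts its own values at the same indices, VecSum combines entries set by all ranks, and ranks that own different numbers of rows make different numbers of collective calls. A minimal sketch of a rank-independent alternative is to accumulate each matrix element in an ordinary local variable. The names term_fn, sample_term and compute_row are hypothetical; sample_term merely stands in for whatever the innermost loops really evaluate.

```c
/* Hypothetical signature for the quantity summed over alpha. */
typedef double (*term_fn)(int row, int col, int alpha);

/* Hypothetical stand-in term, chosen only so the result is easy to check. */
static double sample_term(int row, int col, int alpha)
{
    return row + 10.0 * col + 100.0 * alpha;
}

/* Compute one row of the matrix using purely local arithmetic: no Vec, no
   assembly, no collective call. The result depends only on the global row
   index, so it is the same whichever rank owns the row. */
static void compute_row(int row, int matsize, int dim, term_fn term,
                        double *colvalues)
{
    for (int col = 0; col < matsize; col++) {
        double element = 0.0;              /* plays the role of VecSum */
        for (int alpha = 0; alpha < dim; alpha++)
            element += term(row, col, alpha);
        colvalues[col] = element;
    }
}
```

Each rank would call compute_row for the rows in [Istart, Iend) and pass colvalues to MatSetValues as before. If a PETSc Vec is still wanted as scratch space, creating it on PETSC_COMM_SELF keeps the assembly and the sum local to each rank.
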
>
> > The code is large and complex, so I have created a smaller program with
> > the same loop structure here: <http://pastebin.ca/2457643>
> >
> > Compiling it and running it with "mpirun -np $N ./test -draw_pause -1"
> > gives different results for different values of N, even though it's not
> > supposed to.
>
> What do you expect to see?
>
> > Here is a sample output <http://i42.tinypic.com/2s16ccw.jpg> for
> > N=1,2,3,4 from left to right.
> >
> > Can anyone guide me as to what I'm doing wrong? Are any of the PETSc
> > routines used not parallelizable?
> >
> > Thanks in advance,
> >
> > Regards.
> >
> > --
> > ---
> > *Analabha Roy*
> > C.S.I.R <http://www.csir.res.in> Senior Research
> > Associate<http://csirhrdg.res.in/poolsra.htm>
> > Saha Institute of Nuclear Physics <http://www.saha.ac.in>
> > Section 1, Block AF
> > Bidhannagar, Calcutta 700064
> > India
> > *Emails*: daneel at physics.utexas.edu, hariseldon99 at gmail.com
> > *Webpage*: http://www.ph.utexas.edu/~daneel/
>
--
---
*Analabha Roy*
C.S.I.R <http://www.csir.res.in> Senior Research
Associate<http://csirhrdg.res.in/poolsra.htm>
Saha Institute of Nuclear Physics <http://www.saha.ac.in>
Section 1, Block AF
Bidhannagar, Calcutta 700064
India
*Emails*: daneel at physics.utexas.edu, hariseldon99 at gmail.com
*Webpage*: http://www.ph.utexas.edu/~daneel/