[petsc-users] no-debug-mode problem

Longxiang Chen suifengls at gmail.com
Sat Jun 29 01:10:14 CDT 2013


When I don't use -O2 or -O3, both compilers produce the correct answers.
When I use ifort with -O2/-O3, it produces the wrong answer.
When I use gfortran-4.7 with -O2/-O3, it cannot even read in the input
file.
When I use gfortran-4.4 with -O2/-O3, it also produces the correct answer.
I think the problem is in the Fortran code.

All of the errors reported by valgrind are about uninitialised values:

==8397== Memcheck, a memory error detector
==8397== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==8397== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==8397== Command: ./eco2n
==8397==
==8397== Conditional jump or move depends on uninitialised value(s)
==8397==    at 0x11E1F367: __intel_sse2_strcat (in /opt/intel/composer_xe_2013.3.163/compiler/lib/intel64/libirc.so)
==8397==    by 0xC728F0: read_configuration_files (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xC20DE3: MPID_Init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBEAC54: MPIR_Init_thread (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBEA443: PMPI_Init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBB6CAC: mpi_init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0x43C724: MAIN__ (t2cg22.f:319)
==8397==    by 0x40C2CB: main (in /home/lchen/seq_v2/eco2n)
==8397==
==8397== Use of uninitialised value of size 8
==8397==    at 0x11E1F3DA: __intel_sse2_strcat (in /opt/intel/composer_xe_2013.3.163/compiler/lib/intel64/libirc.so)
==8397==    by 0xC728F0: read_configuration_files (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xC20DE3: MPID_Init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBEAC54: MPIR_Init_thread (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBEA443: PMPI_Init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0xBB6CAC: mpi_init (in /home/lchen/seq_v2/eco2n)
==8397==    by 0x43C724: MAIN__ (t2cg22.f:319)
==8397==    by 0x40C2CB: main (in /home/lchen/seq_v2/eco2n)


...

There are hundreds of similar errors.
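
The valgrind reports shown above sit inside MPI_Init, so they do not yet
point at the matrix arrays. A minimal sketch of an extra check on the C side,
assuming the IA/JA/VA triplet arrays described in the quoted message below
use 0-based indices. The helper coo_count_bad is hypothetical (it is not part
of PETSc or of the original code). It only flags out-of-range indices and
non-finite values, so a finite but wrong entry such as 1.000000e-25 would
still pass.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* Hypothetical helper, not a PETSc routine.
 * Scans a matrix stored as coordinate triplets:
 *   ia[i] = row index, ja[i] = column index, va[i] = value at (ia[i], ja[i]).
 * Assumes 0-based indices (an assumption; Fortran callers often pass 1-based).
 * Returns the number of entries with an out-of-range index or a NaN/Inf value
 * and reports each one on stderr. */
int coo_count_bad(int nrows, int ncols, int nz,
                  const int *ia, const int *ja, const double *va)
{
    int bad = 0;

    assert(ia != NULL && ja != NULL && va != NULL);

    for (int i = 0; i < nz; i++) {
        int in_range = ia[i] >= 0 && ia[i] < nrows &&
                       ja[i] >= 0 && ja[i] < ncols;
        if (!in_range || !isfinite(va[i])) {
            fprintf(stderr, "bad entry %d: (%d, %d) = %e\n",
                    i, ia[i], ja[i], va[i]);
            bad++;
        }
    }
    return bad;
}
```

Calling something like this right before the arrays are handed to the solver,
in both the debug and the optimized build, would show whether the arrays are
already damaged on entry to the C code.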


Best regards,
Longxiang Chen



On Fri, Jun 28, 2013 at 11:05 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> On Jun 29, 2013, at 12:24 AM, Longxiang Chen <suifengls at gmail.com> wrote:
>
> > I find that the fortran program cannot add optimization flag (-O2 or
> > -O3) if I use gfortran-4.7.2 or ifort to compile.
>
>    What do you mean? Do you mean to say that if you use the -O2 or -O3 flags
> with gfortran 4.7.2 or ifort it produces the wrong answer? Does it produce the
> same wrong answer with both compilers?  If you use no optimization with
> both compilers does it produce the "correct" answer?
>
> > When I change the compiler to gfortran-4.4, now the --with-debugging=0
> > (adding -O3) works.
>
>    But if you use the -O3 flag with gfortran 4.4, it runs and gives the correct
> answer?
>
>    Since it works incorrectly with two very different Fortran compilers
> (optimized), it is more likely that there is something wrong with your code than
> with both compilers. You should definitely run with valgrind and make
> sure there is no memory corruption in your code.
>
>    Barry
>
> >
> > Thank you all.
> >
> > Best regards,
> > Longxiang Chen
> >
> >
> >
> >
> > On Fri, Jun 28, 2013 at 3:25 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >   Run both the debug and optimized versions with valgrind:
> > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> >
> >    Barry
> >
> > On Jun 28, 2013, at 3:30 PM, Longxiang Chen <suifengls at gmail.com> wrote:
> >
> > > A is formed from three arrays: IA[NZ], JA[NZ] and VA[NZ].
> > > IA[i] is the row index, JA[i] is the column index, and VA[i] is the value
> > > at (IA[i], JA[i]).
> > >
> > > For the Intel compiler:
> > > when I use --with-debugging=0, the values in VA[] are not correct. I don't
> > > know what kind of optimization it does.
> > >
> > > A[0][0] = 1.000000e-25
> > > A[0][1] = 0.000000e+00
> > > A[0][2] = 0.000000e+00
> > > A[1][0] = 0.000000e+00
> > > A[1][1] = -3.479028e+02
> > > A[1][2] = 0.000000e+00
> > > A[2][0] = 0.000000e+00
> > > A[2][1] = 0.000000e+00
> > > A[2][2] = -3.479028e+02
> > > A[3][3] = 1.000000e-25
> > > ...
> > >
> > > CORRECT:
> > > A[0][0] = -2.961372e-07
> > > A[0][1] = 1.160201e+02
> > > A[0][2] = 2.744589e+02
> > > A[1][0] = 0.000000e+00
> > > A[1][1] = -3.479028e+02
> > > A[1][2] = 0.000000e+00
> > > A[2][0] = -8.332708e-08
> > > A[2][1] = 0.000000e+00
> > > A[2][2] = -3.479028e+02
> > > A[3][3] = -3.027917e-07
> > > ...
> > >
> > > For gcc-4.7.2:
> > > when I use --with-debugging=0, the Fortran main function cannot read the
> > > input data before it starts the loop; ksp_solver() is called inside the
> > > loop.
> > > ===>
> > >  ERRONEOUS DATA INITIALIZATION            STOP EXECUTION---------
> > >
> > > Best regards,
> > > Longxiang Chen
> > >
> > > Do something everyday that gets you closer to being done.
> > >
> > >
> > >
> > > On Thu, Jun 27, 2013 at 6:42 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >
> > > On Jun 27, 2013, at 2:23 PM, Tabrez Ali <stali at geology.wisc.edu> wrote:
> > >
> > > > Fortran can be tricky.
> > > >
> > > > Try to run the program in valgrind and/or recheck all your
> > > > arguments. I once forgot MatAssemblyType in the call to MatAssembly and the
> > > > code still ran fine on one machine but failed on another. It is better to
> > > > test with at least 2-3 compilers (GNU, Solaris and Open64 Fortran/C
> > > > compilers are all free on Linux).
> > > >
> > > > T
> > >
> > >    You can also run both versions with -snes_monitor -ksp_monitor and
> > > see if they both start out the same way.
> > >
> > >    Barry
> > >
> > > >
> > > >
> > > > On 27.06.2013 14:52, Longxiang Chen wrote:
> > > >> Dear all,
> > > >>
> > > >> I use ksp_solver to solve Ax=b, where A comes from an outer loop of a
> > > >> PDE.
> > > >> Under debug mode (the default), it solves the problem in about 4000
> > > >> iterations.
> > > >> And the final answer is correct (compared to another solver).
> > > >>
> > > >> I use the Intel compiler.
> > > >> The program is in Fortran (built with mpif90), except the solver, which
> > > >> is in C (built with mpicc).
> > > >>
> > > >> However, when I re-configure with --with-debugging=0 (the only
> > > >> change),
> > > >> the program terminates in about 30 iterations with the wrong final
> > > >> solution.
> > > >>
> > > >> Thank you.
> > > >>
> > > >> Best regards,
> > > >> Longxiang Chen
> > > >>
> > > >> Do something everyday that gets you closer to being done.
> > > >
> > >
> > >
> >
> >
>
>