[petsc-dev] Petsc: Error code 1

Satish Balay balay at mcs.anl.gov
Tue Apr 6 12:06:37 CDT 2021


> See the attachments. alltest.log is on a machine with 96 cores, ARM, with FC, gcc 9.3.5, mpich 3.4.1, fblaslapack; 6 failures

Perhaps this is an issue with ARM, and such diffs are expected, as we already have multiple alt output files for some of these tests:

$ ls -lt src/tao/bound/tutorials/output/plate2f_*
-rw-r--r--. 1 balay balay 1029 Mar 23 19:48 src/tao/bound/tutorials/output/plate2f_1_alt.out
-rw-r--r--. 1 balay balay 1071 Mar 23 19:48 src/tao/bound/tutorials/output/plate2f_1.out
-rw-r--r--. 1 balay balay 1029 Mar 23 19:48 src/tao/bound/tutorials/output/plate2f_2_alt.out
-rw-r--r--. 1 balay balay 1071 Mar 23 19:48 src/tao/bound/tutorials/output/plate2f_2.out

>>>>>>
not ok diff-vec_is_is_tutorials-ex2f_1 # Error code: 1
#       16,24d15
#       <   5
#       <   7
#       <   9
#       <  11
#       <  13
#       <  15
#       <  17
#       <  19
#       <  21
<<<<<

This one is puzzling: the Fortran stdout is missing. Perhaps a compile issue on ARM? [It's a sequential example, so we can't blame MPI.]

Or are they all related to the optimization flags used? What configure options were used for the build?
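
To illustrate Barry's point (quoted below) about floating point literals that are first treated as single precision and then converted to double, here is a minimal sketch. It is plain Python, not PETSc or Fortran code; it emulates the single-precision rounding with the standard struct module, and the helper name as_single and the value 0.1 are chosen only for illustration.

```python
import struct


def as_single(x: float) -> float:
    """Round a double to the nearest IEEE-754 single, then widen it back to double."""
    return struct.unpack("f", struct.pack("f", x))[0]


# A value stored directly in double precision.
literal_double = 0.1

# The same value after passing through single precision first,
# which is what happens when a single-precision literal is promoted.
literal_single = as_single(0.1)

error = abs(literal_single - literal_double)

print(f"double literal : {literal_double:.17g}")
print(f"single->double : {literal_single:.17g}")
print(f"absolute error : {error:.3e}")
```

The error introduced this way is far larger than double-precision roundoff, so it could plausibly shift the function values an optimization test prints. In Fortran the usual remedy is to write literals with a double-precision kind (for example 0.1d0); whether this explains the failures here is not established.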

Satish

On Tue, 6 Apr 2021, Barry Smith wrote:

> 
>     Alp,
> 
>    Except for the first test, these are all optimization problems (mostly in Fortran). The function values are very different, so I am sending it to our optimization expert to take a look. The differences could possibly be related to the use of real(), and maybe to the direct use of floating point numbers that the compiler first treats as single and then converts to double, thus losing precision.
> 
>    Chen Gang, I assume you compiled with the default standard precision PETSc configure options?
> 
> 
> 
> On Apr 6, 2021, at 3:56 AM, Chen Gang <569615491 at qq.com> wrote:
> 
> 
> See the attachments. alltest.log is on a machine with 96 cores, ARM, with FC, gcc 9.3.5, mpich 3.4.1, fblaslapack; 6 failures
> alltest2.log is on an Intel machine with 40 cores, x86, without FC; icc, MKL, and Intel MPI; only 1 failure
> 
> ------------------ Original Message ------------------
> From: "petsc-dev" <balay at mcs.anl.gov>;
> Sent: Tuesday, April 6, 2021, 12:38 PM
> To: "petsc-dev" <petsc-dev at mcs.anl.gov>;
> Cc: "Chen Gang" <569615491 at qq.com>; "cglwdm" <cglwdm at scu.edu.cn>;
> Subject: Re: [petsc-dev] Petsc: Error code 1
> 
> Note: do not use '-j' with alltests.
> 
> And run alltests on both machines [but *not* on both machines at the same time] and send us the logs from both runs.
> 
> Satish
> 
> 
> On Mon, 5 Apr 2021, Satish Balay wrote:
> 
> > Try:
> >
> > make alltests TIMEOUT=600
> >
> > And send us the complete log (alltests.log)
> >
> > Satish
> >
> > On Tue, 6 Apr 2021, Chen Gang wrote:
> >
> > > Dear sir,
> > >
> > >
> > > The result of make check is OK, and I did set the timeout to a larger value, which keeps me from getting timeout errors. The thing is, I have two machines, and I get error code 1 in different tests on different machines. I don't know what error code 1 is. What causes this? How can I fix the failing tests?
> > >
> > >
> > > ------------------ Original ------------------
> > > From: Satish Balay <balay at mcs.anl.gov>
> > > Date: Tue, Apr 6, 2021 0:18 PM
> > > To: Chen Gang <569615491 at qq.com>
> > > Cc: petsc-dev <petsc-dev at mcs.anl.gov>, cglwdm <cglwdm at scu.edu.cn>
> > > Subject: Re: [petsc-dev] Petsc: Error code 1
> >
> 
> <alltests2.log><alltests.log>
> 
> 

