[petsc-dev] PETSC_VERSION_GE
Barry Smith
bsmith at mcs.anl.gov
Wed Apr 16 13:33:53 CDT 2014
Mark,
Please send configure.log and make.log and run with 4 threads and send all output.
Now Ed and I have had no problem running this code. But there are some issues with running code where each thread creates its own objects. That is, I have another example in C that does not work. There are places where we work with MPI attributes and they are not properly protected with locks. This may or may not be affecting you. If you have to develop code that has different threads create different objects you are welcome to work with Jed and me etc. in getting the thread stuff working in PETSc, but this branch is not the starting point. So basically Ed got lucky and we won’t have “real” support for this usage of threads for a while (months at least).
You absolutely should configure with --with-debugging --with-log=0
Barry
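
On the remark above about shared state that is not properly protected with locks: the following is a minimal sketch in C, using POSIX threads, of what such protection generally looks like. It is not PETSc code and is not claimed to be the fix for the crash reported below; every demo_* name is hypothetical and the "registry" is just a counter standing in for shared library state.

```c
#include <pthread.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical shared state: a counter standing in for a shared registry. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_registry_size = 0;

/* Every update of the shared state happens while holding the lock. */
static void demo_register(void)
{
  pthread_mutex_lock(&demo_lock);
  demo_registry_size++;
  pthread_mutex_unlock(&demo_lock);
}

/* Each thread registers *arg entries. */
static void *demo_worker(void *arg)
{
  int n = *(int *)arg;
  for (int i = 0; i < n; i++) demo_register();
  return NULL;
}

/* Start nthreads workers, wait for them, and return the registry size. */
int demo_run(int nthreads, int per_thread)
{
  pthread_t tid[DEMO_MAX_THREADS];
  int started = 0;

  if (nthreads < 1 || nthreads > DEMO_MAX_THREADS) return -1;
  demo_registry_size = 0;
  for (int i = 0; i < nthreads; i++) {
    if (pthread_create(&tid[i], NULL, demo_worker, &per_thread) != 0) break;
    started++;
  }
  for (int i = 0; i < started; i++) pthread_join(tid[i], NULL);
  return demo_registry_size;
}
```

With the lock in place, demo_run(4, 1000) should return 4000. Without it the increments could race and the total could come up short. Depending on the toolchain, the build may need -pthread.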
On Apr 16, 2014, at 10:26 AM, Mark Adams <mfadams at lbl.gov> wrote:
> I could also use your compile line. I am getting no output.
>
>
> On Wed, Apr 16, 2014 at 10:41 AM, Ed D'Azevedo <dazevedoef at ornl.gov> wrote:
> Hi Mark,
>
> I hope this back trace might be helpful.
>
> You may need to build petsc with
> ./configure \
> --with-x=0 \
> --with-debugging=0 \
> --with-log=0 \
>
>
> env003> addr2line --exe=tpetsc_madams 0x4b76e3 0x4aa3d7 0x484888 0x475798 0x548aa5 0x5e5df2 0x57a353 0x4a701e 0x44e9be 0x44e1f4
> /autofs/na3_home1/adams/petsc/src/sys/memory/mal.c:27
> /autofs/na3_home1/adams/petsc/src/sys/utils/str.c:188
> /autofs/na3_home1/adams/petsc/src/sys/logging/utils/eventlog.c:317
> /autofs/na3_home1/adams/petsc/src/sys/logging/plog.c:747
> /autofs/na3_home1/adams/petsc/src/mat/interface/dlregismat.c:145
> /autofs/na3_home1/adams/petsc/src/mat/utils/gcreate.c:57
> /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/aij.c:3576
> /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/ftn-custom/zaijf.c:14
> /autofs/na3_home1/efdazedo/test/PETSC/./tpetsc.F90:147
>
>
>
>
>
> ======= Backtrace: =========
> /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
> /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
> ./tpetsc_madams[0x4b7759]
> /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
> /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
> /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
> /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
> /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
> /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
> ./tpetsc_madams() [0x4b76e3]
> ./tpetsc_madams() [0x4aa3d7]
> ./tpetsc_madams() [0x484888]
> ./tpetsc_madams() [0x475798]
> ./tpetsc_madams() [0x548aa5]
> ./tpetsc_madams() [0x5e5df2]
> ./tpetsc_madams() [0x57a353]
> ./tpetsc_madams() [0x4a701e]
> ./tpetsc_madams() [0x44e9be]
> ./tpetsc_madams() [0x44e1f4]
> /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
> ./tpetsc_madams() [0x44e0e9]
>
>
>
>
>
>
> On 04/16/2014 10:33 AM, Mark Adams wrote:
>> cc'ing petsc-dev.
>>
>> I will try it.
>>
>> On Wed, Apr 16, 2014 at 10:21 AM, Ed D'Azevedo <dazevedoef at ornl.gov> wrote:
>>
>> Hi Mark,
>>
>> I got an error when I tried the simple test code (see attached) on Titan.
>>
>> Can you try to run the attached test case to see if it will work for you?
>>
>> I have also sent the simple test code to Barry.
>>
>>
>>
>> The code seems to work with 1 thread
>>
>> env003> export OMP_NUM_THREADS=1
>> env003> aprun -n 1 -d 16 ./tpetsc_madams
>> PETSC_VERSION_RELEASE 0
>> PETSC_VERSION_MAJOR 3
>> PETSC_VERSION_MINOR 4
>> PETSC_VERSION_SUBMINOR 4
>> PETSC_VERSION_PATCH 0
>> PETSC_VERSION_DATE unknown
>> petsc_version_lt(3,3,0) is false
>> nthreads = 1 NCASES = 100
>> nz = 88804
>> Warning: ieee_inexact is signaling
>> all done
>> total time is 7.268438
>> maxval(err) 6.9650285539069046E-011
>> Application 4895013 resources: utime ~8s, stime ~0s, Rss ~22300, inblocks ~11428, outblocks ~35848
>>
>>
>> The code seems to have trouble using more threads
>>
>> env003> export OMP_NUM_THREADS=16
>> env003> aprun -n 1 -d 16 ./tpetsc_madams
>> *** glibc detected *** ./tpetsc_madams: double free or corruption (!prev): 0x00000000015ee0b0 ***
>> PETSC_VERSION_RELEASE 0
>> PETSC_VERSION_MAJOR 3
>> PETSC_VERSION_MINOR 4
>> PETSC_VERSION_SUBMINOR 4
>> PETSC_VERSION_PATCH 0
>> PETSC_VERSION_DATE unknown
>> petsc_version_lt(3,3,0) is false
>> nthreads = 16 NCASES = 100
>> tpetsc_madams: malloc.c:3091: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
>> Error: abort
>> rax 0000000000000000, rbx 0000000000000fff, rcx ffffffffffffffff
>> rdx 0000000000000006, rsp 00002aaac4554f78, rbp 00002aaad0000098
>> rsi 0000000000005200, rdi 00000000000051f1, r8 00000000ffffffff
>> r9 00002aaaba8f9e40, r10 0000000000000008, r11 0000000000000202
>> r12 0000000000000000, r13 00002aaad0008e30, r14 0000000000000020
>> r15 0000000000000000
>> ======= Backtrace: =========
>> /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
>> /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
>> ./tpetsc_madams[0x4b7759]
>> /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
>> /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
>> /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
>> /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
>> /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
>> /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
>> ./tpetsc_madams() [0x4b76e3]
>> ./tpetsc_madams() [0x4aa3d7]
>> ./tpetsc_madams() [0x484888]
>> ./tpetsc_madams() [0x475798]
>> ./tpetsc_madams() [0x548aa5]
>> ./tpetsc_madams() [0x5e5df2]
>> ./tpetsc_madams() [0x57a353]
>> ./tpetsc_madams() [0x4a701e]
>> ./tpetsc_madams() [0x44e9be]
>> ./tpetsc_madams() [0x44e1f4]
>> /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
>> ./tpetsc_madams() [0x44e0e9]
>> Application 4895021 exit codes: 127
>> Application 4895021 resources: utime ~0s, stime ~0s, Rss ~12544, inblocks ~11429, outblocks ~35849
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On 04/11/2014 04:34 PM, Mark Adams wrote:
>>> PETSc dev 'master' now has Barry's thread safe stuff so you should be able to use that. I have built it in:
>>>
>>> PETSC_DIR=/autofs/na3_home1/adams/petsc
>>> PETSC_ARCH=arch-titan-opt
>>>
>>> So try this version out. And revert the code to the repo version by doing:
>>>
>>> > git checkout poisson.F90
>>>
>>> and any other place where #if PETSC_VERSION_GE(3,5,0) is used. I only see poisson.F90.
>>>
>>> If this works I can install it wherever you like as Ed did.
>>>
>>> Mark
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 11:24 PM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> Hi Mark,
>>>
>>> I think it is not a big problem until users other than you move to petsc 3.5. For now I have disabled the #if/#endif for other users. Sorry for the inconvenience. Could you uncomment it when you are working with 3.5, and please not commit that to core_dev?
>>> We need Ed's thread safe petsc for performance reasons, so we can find some way to resolve it when 3.5 is released and installed on titan.
>>>
>>> Thanks,
>>> Seung-Hoe
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 11:38 AM, Mark Adams <mfadams at lbl.gov> wrote:
>>> We've figured out the problem (again). It will be fixed in future versions.
>>>
>>> We can fix your installation. I'm guessing Ed did this installation so it might not be worth fixing this code since you are up and running. Let me know what you want to do.
>>>
>>> The problem is that this is a development version of the code, which is not as stable as the releases. This version needs to be updated to fix this problem. I would have to tell you how to do this update, so let me know if you want to do it.
>>>
>>> Mark
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 8:58 AM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> One question is..
>>> If PETSC_VERSION_GE is defined in terms of PETSC_VERSION_GT, and PETSC_VERSION_GT is defined in terms of something in petsc.h,
>>> will redefining PETSC_VERSION_GT after petsc.h also change PETSC_VERSION_GE?
>>>
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 10:55 AM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> This is the code I tried to compile.
>>>
>>> #if PETSC_VERSION_GE(5,5,0)
>>> BBBBBBBBBBBBBBBBBBB=0
>>> call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat, ierr )
>>> #else
>>> AAAAAAAAAAAA=0
>>> call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat, SAME_NONZERO_PATTERN, ierr )
>>> #endif
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 10:49 AM, Mark Adams <mfadams at lbl.gov> wrote:
>>>
>>>
>>>
>>> On Thu, Apr 10, 2014 at 8:25 AM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> I have the same problem. I used poisson.F90 in /lustre/atlas2/env003/scratch/shku/XGC1_3_petsc_problem/
>>>
>>> It seems that #undef should be after #include<finclude/petsc.h>. Otherwise, petsc.h seems to try to redefine it.
>>>
>>>
>>> Oh yes.
>>>
>>> Anyway, I got the same error message:
>>>
>>> PGF90-S-0038-Symbol, bbbbbbbbbbbbbbbbbbb, has not been explicitly declared (poisson.F90)
>>> 0 inform, 0 warnings, 1 severes, 0 fatal for init_1field_solver
>>>
>>>
>>> is this in the source file?
>>>
>>> I tried 5.5.0 for the arguments of PETSC_VERSION_GE, but still have the same problem.
>>>
>>> Or do you mean VERSION_GE and VERSION_LT instead of VERSION_GT and VERSION_LE?
>>>
>>> Thanks,
>>> Seung-Hoe
>>>
>>>
>>>
>>> On Wed, Apr 9, 2014 at 9:54 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>> We might have a fix. It turns out that some Fortran compilers do not do the #define quite right. Try this:
>>>
>>> > git checkout poisson.F90
>>>
>>> This will put the original (bad) file back. Then add to poisson.F90
>>>
>>> #undef PETSC_VERSION_GT
>>> #define PETSC_VERSION_GT(MAJOR,MINOR,SUBMINOR) \
>>> (0==PETSC_VERSION_LE(MAJOR,MINOR,SUBMINOR))
>>>
>>> This is the fix that Jed thinks will work. If it works we can propagate it.
>>>
>>> Sorry about the confusion,
>>> Mark
>>>
>>>
>>>
>>> On Wed, Apr 9, 2014 at 7:36 PM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> Great! Thank you.
>>>
>>>
>>>
>>> On Wed, Apr 9, 2014 at 9:35 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>> Good news. I'm at a meeting and Jed Brown is here and he seems to have been able to reproduce the error. Preprocessors can be a pain. I'm going to wait until Jed has a chance to look at this and come up with a solution.
>>>
>>>
>>> On Wed, Apr 9, 2014 at 7:28 PM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> Right now I am using the iterative solver, which will be replaced by your 2-field solver later.
>>> Sorry. The line number 3003 is wrong. It is 209 or near it.
>>> Yes. We can take a look at this when you come. It is not an urgent problem.
>>>
>>> Thanks,
>>> Seung-Hoe
>>>
>>>
>>>
>>> On Wed, Apr 9, 2014 at 9:25 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>> I noticed that your makefile was not using the new petsc solver. Is that intentional?
>>>
>>> The line you gave me, 3003 of poisson.F90, seems to be the last line of the file.
>>>
>>> I am not getting Edison or Titan to build. We can take a look at this next week.
>>>
>>> Mark
>>>
>>>
>>>
>>> On Tue, Apr 8, 2014 at 12:38 PM, Seung-Hoe Ku <sku at pppl.gov> wrote:
>>> Hi Mark,
>>>
>>> It seems that PETSC_VERSION_GE is not working at line 3003 of poisson.F90. Is there any reason for using PETSC_VERSION_GE instead of PETSC_VERSION_LE? The previous code worked, I think.
>>>
>>> Thanks,
>>> Seung-Hoe
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
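
On the question raised in the thread, whether redefining PETSC_VERSION_GT after the PETSc include also changes a macro that is defined in terms of it: under standard C preprocessor rules a macro body is expanded at the point of use, not at the point of definition, so a later redefinition of the inner macro does affect the outer one. The sketch below illustrates this in C. The DEMO_* names and the version 3.4.4 (chosen to match the output quoted above) are hypothetical stand-ins, not PETSc's definitions, and a Fortran compiler's preprocessor may not follow these rules exactly, which appears to be the problem discussed in the thread.

```c
/* Hypothetical stand-ins; in real code these come from the PETSc headers. */
#define DEMO_VERSION_MAJOR    3
#define DEMO_VERSION_MINOR    4
#define DEMO_VERSION_SUBMINOR 4

/* True when the configured version is strictly older than the argument. */
#define DEMO_VERSION_LT(MAJOR,MINOR,SUBMINOR)            \
  (DEMO_VERSION_MAJOR < (MAJOR) ||                       \
   (DEMO_VERSION_MAJOR == (MAJOR) &&                     \
    (DEMO_VERSION_MINOR < (MINOR) ||                     \
     (DEMO_VERSION_MINOR == (MINOR) &&                   \
      DEMO_VERSION_SUBMINOR < (SUBMINOR)))))

/* Written as (0 == ...) in the same style as the fix proposed in the thread. */
#define DEMO_VERSION_GE(MAJOR,MINOR,SUBMINOR) \
  (0 == DEMO_VERSION_LT(MAJOR,MINOR,SUBMINOR))

/* Compile-time branch in the style of the poisson.F90 snippet. */
int demo_uses_new_api(void)
{
#if DEMO_VERSION_GE(3,5,0)
  return 1; /* newer calling sequence */
#else
  return 0; /* older calling sequence */
#endif
}

/* The same macro also works in an ordinary expression. */
int demo_at_least_3_3_0(void) { return DEMO_VERSION_GE(3,3,0); }

/* Expansion happens at the point of use: redefining the inner macro
   changes what the outer macro expands to from then on. */
#define DEMO_INNER() 1
#define DEMO_OUTER() (DEMO_INNER() + 10)
int demo_outer_before(void) { return DEMO_OUTER(); }
#undef DEMO_INNER
#define DEMO_INNER() 2
int demo_outer_after(void) { return DEMO_OUTER(); }
```

With the stand-in version 3.4.4, demo_uses_new_api() takes the #else branch, and the two demo_outer_* functions return different values even though DEMO_OUTER itself was never redefined.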