bombing out writing large scratch files
Barry Smith
bsmith at mcs.anl.gov
Sat May 27 17:48:55 CDT 2006
Sometimes a subtle memory bug can lurk under the covers and
then appear in a big problem. You can try putting a CHKMEMQ
right before the if (rank == ) in the code and run the debug
version with -malloc_debug
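A minimal sketch of the placement, assuming the routine already pulls in
the PETSc Fortran include files (the surrounding lines are from Randy's
snippet quoted below; only the CHKMEMQ lines are new):

```fortran
! Hedged sketch: with -malloc_debug, CHKMEMQ validates PETSc's malloc
! guard zones at this point, so corruption is reported near where it
! happened rather than at some later, unrelated call.
      CHKMEMQ
      IF (rank == 0) THEN
         irec = (iper-1)*2 + ipol
         write(7,rec=irec) (xvec(i),i=1,np)
      END IF
      CHKMEMQ
```

If the second CHKMEMQ trips but the first does not, the write itself
is the likely culprit.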
You could also consider valgrind (valgrind.org).
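For the valgrind route, one possible invocation (the executable name
./myprog and the process count are placeholders, and the exact mpirun
syntax depends on your MPI installation):

```shell
# Run each MPI rank under valgrind's memcheck tool; memory errors
# are reported per process with a [pid] prefix.
mpirun -np 4 valgrind --tool=memcheck --leak-check=yes ./myprog
```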
Barry
On Sat, 27 May 2006, Randall Mackie wrote:
> xvec is a double precision complex vector that is dynamically allocated
> once np is known. I've printed out the np value and it is correct.
> This works on the first pass, but not the second.
>
> This PETSc program has been working just fine for a couple years now,
> the only difference this time is the size of the model I'm working
> with, which is substantially larger than typical.
>
> I'm going to try to run this in the debugger and see if I can get
> any more information.
>
> Randy
>
>
> Barry Smith wrote:
>>
>> Randy,
>>
>> The only "PETSc"-related reason for this is that
>> xvec(i), i=1,np is accessing out of range. What is xvec,
>> and is it of length 1 to np?
>>
>> Barry
>>
>>
>> On Sat, 27 May 2006, Randall Mackie wrote:
>>
>>> In my PETSc based modeling code, I write out intermediate results to a
>>> scratch
>>> file, and then read them back later. This has worked fine up until today,
>>> when for a large model, this seems to be causing my program to crash with
>>> errors like:
>>>
>>> ------------------------------------------------------------------------
>>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>> probably memory access out of range
>>>
>>>
>>> I've tracked down the offending code to:
>>>
>>> IF (rank == 0) THEN
>>>    irec = (iper-1)*2 + ipol
>>>    write(7,rec=irec) (xvec(i),i=1,np)
>>> END IF
>>>
>>> It writes out xvec for the first record, but then on the second
>>> record my program is crashing.
>>>
>>> The record length (from an inquire statement) is recl 22626552
>>>
>>> The size of the scratch file when my program crashes is 98M.
>>>
>>> PETSc is compiled using the Intel compilers (v9.0 for Fortran),
>>> and the user's manual says that you can have record lengths of
>>> up to 2 billion bytes.
>>>
>>> I'm kind of stuck as to what might be the cause. Any ideas from anyone
>>> would be greatly appreciated.
>>>
>>> Randy Mackie
>>>
>>> ps. I've tried both the optimized and debugging versions of the PETSc
>>> libraries, with the same result.
>>>