bombing out writing large scratch files

Barry Smith bsmith at mcs.anl.gov
Sat May 27 17:48:55 CDT 2006


   Sometimes a subtle memory bug can lurk under the covers and
then only appear in a big problem. You can try putting a CHKMEMQ
right before the IF (rank == 0) in the code and running the debug
version with -malloc_debug.
You could also consider valgrind (valgrind.org).
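
For example, a minimal sketch of where the CHKMEMQ calls would go,
mirroring the write loop from your code (this assumes the source is
preprocessed Fortran, e.g. a .F file, so the CHKMEMQ macro from the
PETSc Fortran include files is seen by the compiler):

          CHKMEMQ                 ! check the PETSc-tracked heap just before the write
          IF (rank == 0) THEN
            irec=(iper-1)*2+ipol
            write(7,rec=irec) (xvec(i),i=1,np)
          END IF
          CHKMEMQ                 ! and again right after, to bracket the write

If the heap has already been corrupted, the run will stop at the
CHKMEMQ instead of at some unrelated spot later; build with the
debugging libraries and run with -malloc_debug so the checking is
active. For valgrind, launching each MPI process under it with
something like "mpirun -np 4 valgrind ./yourcode" usually works;
the binary name and process count here are just placeholders.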

    Barry

On Sat, 27 May 2006, Randall Mackie wrote:

> xvec is a double precision complex vector that is dynamically allocated
> once np is known. I've printed out the np value and it is correct.
> This works on the first pass, but not the second.
>
> This PETSc program has been working just fine for a couple years now,
> the only difference this time is the size of the model I'm working
> with, which is substantially larger than typical.
>
> I'm going to try to run this in the debugger and see if I can get
> any more information.
>
> Randy
>
>
> Barry Smith wrote:
>>
>>   Randy,
>>
>>     The only "PETSc" related reason for this is that
>> xvec(i), i=1,np is accessing out of range. What is xvec
>> and is it of length 1 to np?
>>
>>    Barry
>> 
>> 
>> On Sat, 27 May 2006, Randall Mackie wrote:
>> 
>>> In my PETSc-based modeling code, I write out intermediate results to a
>>> scratch file and then read them back later. This has worked fine up until
>>> today, when, for a large model, it seems to be causing my program to crash
>>> with errors like:
>>> 
>>> ------------------------------------------------------------------------
>>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
>>> probably memory access out of range
>>> 
>>> 
>>> I've tracked down the offending code to:
>>>
>>>          IF (rank == 0) THEN
>>>            irec=(iper-1)*2+ipol
>>>            write(7,rec=irec) (xvec(i),i=1,np)
>>>          END IF
>>> 
>>> It writes out xvec for the first record, but then on the second
>>> record my program is crashing.
>>> 
>>> The record length (from an INQUIRE statement) is recl = 22626552.
>>> 
>>> The size of the scratch file when my program crashes is 98M.
>>> 
>>> PETSc is compiled using the Intel compilers (v9.0 for Fortran),
>>> and the user's manual says that you can have record lengths of
>>> up to 2 billion bytes.
>>> 
>>> I'm kind of stuck as to what might be the cause. Any ideas from anyone
>>> would be greatly appreciated.
>>> 
>>> Randy Mackie
>>> 
>>> ps. I've tried both the optimized and debugging versions of the PETSc
>>> libraries, with the same result.
>>> 
>>> 
>>> 
>> 
>
>