bombing out writing large scratch files

Randall Mackie randy at geosystem.us
Sat May 27 21:51:18 CDT 2006


If I use valgrind, can you tell me how to run it with mpirun and
a parallel PETSc program?

Is it valgrind mpirun program, or mpirun valgrind program?

Randy
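
For reference, the usual ordering is to let the MPI launcher start
valgrind, so that each rank runs under its own valgrind instance. A
minimal sketch, assuming 4 processes and a hypothetical executable
named ./myprog (neither comes from this thread); the valgrind flags
shown are one common choice, not a requirement:

```shell
# Assumptions for illustration only: 4 ranks, executable named ./myprog.
NP=4
PROG=./myprog

# mpirun launches NP copies of valgrind; each copy runs one rank of the program.
# %p in --log-file expands to the process ID, so each rank gets its own log.
CMD="mpirun -np $NP valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p $PROG"

# Print the assembled command; to actually run it, use: eval "$CMD"
echo "$CMD"
```

The reverse order (valgrind mpirun program) would, by default, check
the launcher itself rather than the ranks of the PETSc program.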


Barry Smith wrote:
> 
>   Sometimes a subtle memory bug can lurk under the covers and
> then appear only in a big problem. You can try putting a CHKMEMQ
> right before the IF (rank == 0) in the code and running the debug
> version with -malloc_debug.
> You could also consider valgrind (valgrind.org).
> 
>    Barry
> 
> On Sat, 27 May 2006, Randall Mackie wrote:
> 
>> xvec is a double precision complex vector that is dynamically allocated
>> once np is known. I've printed out the np value and it is correct.
>> This works on the first pass, but not the second.
>>
>> This PETSc program has been working just fine for a couple years now,
>> the only difference this time is the size of the model I'm working
>> with, which is substantially larger than typical.
>>
>> I'm going to try to run this in the debugger and see if I can get
>> any more information.
>>
>> Randy
>>
>>
>> Barry Smith wrote:
>>>
>>>   Randy,
>>>
>>>     The only "PETSc" related reason for this is that
>>> xvec(i), i=1,np is accessing out of range. What is xvec
>>> and is it of length 1 to np?
>>>
>>>    Barry
>>>
>>>
>>> On Sat, 27 May 2006, Randall Mackie wrote:
>>>
>>>> In my PETSc based modeling code, I write out intermediate results to
>>>> a scratch file, and then read them back later. This has worked fine
>>>> up until today, when for a large model, this seems to be causing my
>>>> program to crash with errors like:
>>>>
>>>> ------------------------------------------------------------------------
>>>> [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>>> probably memory access out of range
>>>>
>>>>
>>>> I've tracked down the offending code to:
>>>>
>>>>          IF (rank == 0) THEN
>>>>            irec=(iper-1)*2+ipol
>>>>            write(7,rec=irec) (xvec(i),i=1,np)
>>>>          END IF
>>>>
>>>> It writes out xvec for the first record, but then on the second
>>>> record my program is crashing.
>>>>
>>>> The record length (from an inquire statement) is  recl     22626552
>>>>
>>>> The size of the scratch file when my program crashes is 98M.
>>>>
>>>> PETSc is compiled using the Intel compilers (v9.0 for Fortran),
>>>> and the user's manual says that you can have record lengths of
>>>> up to 2 billion bytes.
>>>>
>>>> I'm kind of stuck as to what might be the cause. Any ideas from anyone
>>>> would be greatly appreciated.
>>>>
>>>> Randy Mackie
>>>>
>>>> ps. I've tried both the optimized and debugging versions of the PETSc
>>>> libraries, with the same result.
>>>>
>>>>
>>>>
>>>
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034




More information about the petsc-users mailing list