[petsc-users] segfault in MatAssemblyEnd() when using large matrices on multi-core MAC OS-X

Jed Brown jedbrown at mcs.anl.gov
Wed Aug 1 13:36:20 CDT 2012


We would need to reproduce the problem, or at least see a debugger trace,
before speculating about what went wrong in your case. Have you tried
running under valgrind?

On Wed, Aug 1, 2012 at 11:23 AM, Ronald M. Caplan <caplanr at predsci.com> wrote:

> Hi,
>
> Using FLUSH_ASSEMBLY periodically solved the segfault problem.
> I now use the following code to set the matrix:
>
> DO i=1,N
>    IF (rank .eq. 0) THEN
>       DO j=CSR_AJ(i)+1,CSR_AJ(i+1)
>          call MatSetValue(A,i-1,CSR_AI(j),CSR_A(j),INSERT_VALUES,ierr)
>       END DO
>    END IF
>    !Need to send out matrix periodically otherwise get a segfault
>    !(local buffer gets full? prob on mac only?)
>    IF (mod(i,100) .eq. 0) THEN
>       call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr)
>       call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr)
>    END IF
> END DO
> !Assemble final matrix A across all cores:
> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>
> and everything works fine.
>
> I find the error strange: since I am on a single quad-core Mac, the
> "buffer" should never get "full"...    Is this a bug?
>
>  - Ron C.
>
>
> On Mon, Jul 30, 2012 at 3:09 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>
>> On Mon, Jul 30, 2012 at 3:04 PM, Ronald M. Caplan <caplanr at predsci.com> wrote:
>>
>>> I seem to have solved the problem.
>>>
>>> I was storing my entire matrix on node 0 and then calling MatAssembly
>>> (begin and end) on all nodes (which should have worked...).
>>>
>>> Apparently I was using too much space for the buffering or the like,
>>> because when I change the code so each node sets its own matrix values,
>>> MatAssemblyEnd no longer segfaults.
>>>
>>
>> Can you send the test case? It shouldn't segfault unless the machine
>> runs out of memory (and most desktop systems have overcommit, so the system
>> will kill arbitrary processes, not necessarily the job that did the latest
>> malloc).
>>
>> In practice, you should call MatAssemblyBegin(...,MAT_FLUSH_ASSEMBLY)
>> periodically.
>>
>>
>>>
>>> Why should this be the case?   How many elements of a vector or matrix
>>> can a single node "set" before Assembly to distribute over all nodes?
>>>
>>
>>
>>
>
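
For reference, the fix described in the Jul 30 message (each process
setting only its own matrix values) might look roughly like the sketch
below. It is not code from this thread. It assumes that every process
can read the CSR arrays, that they use the same layout as in the loop
quoted above (0-based row offsets in CSR_AJ, 0-based column indices in
CSR_AI), and that A has already been created and preallocated. The
subroutine name fill_local_rows is made up for illustration, and the
include path is the one used by recent PETSc releases; older releases
used a different path.

```fortran
#include <petsc/finclude/petscmat.h>
      ! Hypothetical helper: each process inserts only the rows it owns,
      ! so no off-process values have to be buffered until assembly.
      subroutine fill_local_rows(A, CSR_AJ, CSR_AI, CSR_A, ierr)
      use petscmat
      implicit none
      Mat            :: A
      PetscInt       :: CSR_AJ(*), CSR_AI(*)
      PetscScalar    :: CSR_A(*)
      PetscErrorCode :: ierr
      PetscInt       :: i, j, row, col, Istart, Iend

      ! Rows Istart..Iend-1 (0-based) are stored on this process.
      call MatGetOwnershipRange(A, Istart, Iend, ierr)

      ! Fortran arrays are 1-based, so local row i-1 lives at index i.
      DO i = Istart+1, Iend
         DO j = CSR_AJ(i)+1, CSR_AJ(i+1)
            row = i-1
            col = CSR_AI(j)
            call MatSetValue(A, row, col, CSR_A(j), INSERT_VALUES, ierr)
         END DO
      END DO

      ! All processes must take part in the final assembly.
      call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr)
      call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr)
      end subroutine fill_local_rows
```

Because every value is set by the process that owns its row, nothing
needs to be communicated during assembly and the periodic
MAT_FLUSH_ASSEMBLY calls are not needed in this variant. The file needs
to go through the C preprocessor (for example by using a .F90
extension) for the #include line to work.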