[petsc-users] segfault in MatAssemblyEnd() when using large matrices on multi-core MAC OS-X
Ronald M. Caplan
caplanr at predsci.com
Wed Aug 1 13:44:11 CDT 2012
Yes, but valgrind did not catch any errors.
I am having the same segfault issue when I use vectors which are "too big"
and do the following:
!Initialize xtrue vector to random values (true solution of Ax=b):
IF (rank .eq. 0) THEN
   allocate(xvec(N))
   allocate(veci(N))
   !Create list of indices (0-based for PETSc):
   DO i=0,N-1
      veci(i+1) = i
   END DO
   call RANDOM_NUMBER(xvec)
   call VecSetValues(xtrue,N,veci,xvec,INSERT_VALUES,ierr)
   deallocate(veci)
END IF
!Assemble the xtrue vector across all cores:
call VecAssemblyBegin(xtrue,ierr)
call VecAssemblyEnd(xtrue,ierr)
This works on one processor, but on multiple cores it segfaults when N is
"too big" (around 3 million). Is there an equivalent "flush" command for
vectors to get around this?
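
A workaround that avoids the communication buffer entirely is to have each
rank set only the entries it owns. A minimal sketch, assuming xtrue is a
parallel Vec of global size N; istart, iend, and nlocal are hypothetical
names for the local ownership range:

!Each rank fills only its locally owned entries, so nothing is
!stashed for off-process delivery during assembly.
call VecGetOwnershipRange(xtrue,istart,iend,ierr)
nlocal = iend - istart
allocate(xvec(nlocal))
allocate(veci(nlocal))
DO i=1,nlocal
   veci(i) = istart + i - 1   !global indices, 0-based
END DO
!Note: each rank draws its own random numbers here, so the result
!differs from the rank-0-only fill above.
call RANDOM_NUMBER(xvec)
call VecSetValues(xtrue,nlocal,veci,xvec,INSERT_VALUES,ierr)
deallocate(veci)
deallocate(xvec)
!Still required (collective), but now communication-free:
call VecAssemblyBegin(xtrue,ierr)
call VecAssemblyEnd(xtrue,ierr)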
- Ron
On Wed, Aug 1, 2012 at 11:36 AM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
> We have to reproduce or at least get a debugger trace to speculate about
> what went wrong in your case. Have you tried running in valgrind?
>
>
> On Wed, Aug 1, 2012 at 11:23 AM, Ronald M. Caplan <caplanr at predsci.com> wrote:
>
>> Hi,
>>
>> Using MAT_FLUSH_ASSEMBLY periodically solved the segfault problem.
>> I now use the following code to set the matrix:
>>
>> DO i=1,N
>>    IF (rank .eq. 0) THEN
>>       DO j=CSR_AJ(i)+1,CSR_AJ(i+1)
>>          call MatSetValue(A,i-1,CSR_AI(j),CSR_A(j),INSERT_VALUES,ierr)
>>       END DO
>>    END IF
>>    !Need to send out the matrix periodically, otherwise we get a segfault
>>    !(local buffer gets full? problem on Mac only?)
>>    IF (mod(i,100) .eq. 0) THEN
>>       call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr)
>>       call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr)
>>    END IF
>> END DO
>> !Assemble final matrix A across all cores:
>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>>
>> and everything works fine.
>>
>> I find the error strange, since I am on a single quad-core Mac; the
>> "buffer" should never get "full"... Is this a bug?
>>
>> - Ron C.
>>
>>
>> On Mon, Jul 30, 2012 at 3:09 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>>
>>> On Mon, Jul 30, 2012 at 3:04 PM, Ronald M. Caplan <caplanr at predsci.com> wrote:
>>>
>>>> I seem to have solved the problem.
>>>>
>>>> I was storing my entire matrix on node 0 and then calling
>>>> MatAssemblyBegin/End on all nodes (which should have worked...).
>>>>
>>>> Apparently I was using too much space for the buffering or the like,
>>>> because when I change the code so that each node sets its own matrix
>>>> values, the MatAssemblyEnd does not segfault.
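>>>>
>>>> For reference, a minimal sketch of that owner-computes pattern, assuming
>>>> each rank can compute the entries of its own rows (rstart and rend are
>>>> hypothetical names for the local row range, only a placeholder diagonal
>>>> fill is shown, and a real-valued PetscScalar build is assumed):
>>>>
>>>> call MatGetOwnershipRange(A,rstart,rend,ierr)
>>>> DO i=rstart,rend-1
>>>>    !Set the entries of locally owned row i (global, 0-based);
>>>>    !here just a unit diagonal as a stand-in:
>>>>    call MatSetValue(A,i,i,1.0d0,INSERT_VALUES,ierr)
>>>> END DO
>>>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>>>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)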
>>>>
>>>
>>> Can you send the test case? It shouldn't segfault unless the machine
>>> runs out of memory (and most desktop systems have overcommit, so the
>>> system will kill arbitrary processes, not necessarily the job that did
>>> the latest malloc).
>>>
>>> In practice, you should call MatAssemblyBegin(...,MAT_FLUSH_ASSEMBLY)
>>> periodically.
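>>>
>>> For example, a sketch of that pattern wrapped in a helper routine; the
>>> routine name, the nflush argument, and the CSR argument names are made
>>> up here, and the include/use lines follow current PETSc Fortran
>>> conventions, which differ in older releases:
>>>
>>> subroutine load_csr_rows(A,n,ia,ja,aval,rank,nflush,ierr)
>>> #include <petsc/finclude/petscmat.h>
>>>    use petscmat
>>>    implicit none
>>>    Mat :: A
>>>    PetscInt :: n, nflush, ia(n+1), ja(*)
>>>    PetscScalar :: aval(*)
>>>    PetscMPIInt :: rank
>>>    PetscErrorCode :: ierr
>>>    PetscInt :: i, j
>>>    !Only rank 0 holds meaningful CSR data; the other ranks pass
>>>    !dummy arrays but must still make the collective flush calls.
>>>    DO i=1,n
>>>       IF (rank .eq. 0) THEN
>>>          DO j=ia(i)+1,ia(i+1)
>>>             call MatSetValue(A,i-1,ja(j),aval(j),INSERT_VALUES,ierr)
>>>          END DO
>>>       END IF
>>>       !Collective: every rank takes this branch at the same i,
>>>       !so the stash is drained before it grows too large.
>>>       IF (mod(i,nflush) .eq. 0) THEN
>>>          call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr)
>>>          call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr)
>>>       END IF
>>>    END DO
>>>    call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr)
>>>    call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr)
>>> end subroutine load_csr_rows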
>>>
>>>
>>>>
>>>> Why should this be the case? How many elements of a vector or matrix
>>>> can a single node "set" before an assembly is needed to distribute
>>>> them over all nodes?
>>>>
>>>
>>>
>>>
>>
>