[petsc-users] Floating point exception, only occurs when assembling "large" matrices

Karl Rupp rupp at mcs.anl.gov
Wed Nov 27 07:13:11 CST 2013


Hi Justin,

> Unfortunately, I don’t have access to valgrind as I’m on Mac OS X
> 10.9 right now. I’m only examining the global mass matrix right now,
> which will be block diagonal in 3x3 blocks if I’m not mistaken. The
> larger application is for DG FEM, which definitely has a different
> non-zero pattern and more than 3 non-zero entries per row. For the
> global matrix overall, my allocation of the non-zero entries per row
> is just a rough estimate. As I understand it, I don’t have to be exact
> when specifying this number to MatCreateSeqAIJ.

This is correct, even though an underestimate is not ideal in terms of 
performance. It's usually better to be a bit pessimistic with the 
estimate so that no reallocations are triggered during assembly.
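
For example (just a sketch, reusing the names NLoc, nElems and Mglobal 
from your snippet below; the factor 2 is arbitrary), you can 
over-allocate and then ask PETSc how many mallocs happened during 
assembly:

   Mat            Mglobal;
   MatInfo        info;
   PetscErrorCode ierr;

   /* deliberately generous: 2*NLoc nonzeros per row instead of NLoc */
   ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, NLoc*nElems, NLoc*nElems,
                          2*NLoc, PETSC_NULL, &Mglobal); CHKERRQ(ierr);

   /* ... MatSetValues() calls as before ... */

   ierr = MatAssemblyBegin(Mglobal, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
   ierr = MatAssemblyEnd(Mglobal, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

   /* a nonzero 'mallocs' count means the preallocation was too small */
   ierr = MatGetInfo(Mglobal, MAT_LOCAL, &info); CHKERRQ(ierr);
   ierr = PetscPrintf(PETSC_COMM_SELF, "mallocs during assembly: %g\n",
                      info.mallocs); CHKERRQ(ierr);

(Running with -info reports the same numbers.)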


> Attached below is a minimal code that should run and produce the problem. I hope it will compile for you. I added comments where necessary. The variable ‘refs’ in main controls the mesh, and on my machine it works up to and including refs=5. For refs=6 is where I get the errors. It’s a bit lengthy but oh well.

The code is valgrind-clean on my machine, but it only writes zeros to 
the matrix (these are the values in Mlocal) for refs=6. Maybe you have 
an overflow or a spurious integer operation somewhere? It seems to me 
that something goes wrong in DG_local_mass() already.
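
The classic pitfall (purely a guess, since I haven't traced through 
DG_local_mass(); the names below are made up) is an integer expression 
that silently truncates, e.g.

   int    i = 3, n = 128;          /* say, a vertex index and subdivision count */
   double x_bad  = i / n;          /* integer division: always 0                */
   double x_good = (double)i / n;  /* intended coordinate 0.0234375             */

If a vertex coordinate or an element area degenerates to zero like 
this, a later division by the Jacobian in double precision gives inf or 
nan, which would also explain the "Inserting nan at matrix entry 
(0,0)" message.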

Best regards,
Karli


>
> On Nov 27, 2013, at 3:22 AM, Karl Rupp <rupp at mcs.anl.gov> wrote:
>
>> Hi Justin,
>>
>> did you run your code through Valgrind? I suspect that this is caused by some memory corruption, particularly as you claim that smaller matrix sizes are fine. Also, do you check all the error codes returned by PETSc routines?
>>
>> Also, are you sure about the nonzeropattern of your matrix? From your description it sounds like you are solving 8192 decoupled problems of size 3x3, while for e.g. typical finite element applications you get 3x3 local mass matrices per cell, with the total number of degrees of freedom given by the number of vertices.
>>
>> If the above doesn't help: Any chance of sending us the relevant source code?
>>
>> Best regards,
>> Karli
>>
>>
>> On 11/27/2013 09:12 AM, Justin Dong (Me) wrote:
>>> I am assembling a global mass matrix on a mesh consisting of 8192
>>> elements and 3 basis functions per element (so the global dimension is
>>> 24,576). For some reason, when assembling this matrix I get tons of
>>> floating point exceptions everywhere:
>>>
>>> [0]PETSC ERROR: --------------------- Error Message
>>> ------------------------------------
>>> [0]PETSC ERROR: Floating point exception!
>>> [0]PETSC ERROR: Inserting nan at matrix entry (0,0)!
>>>
>>> I get this error for every 3rd diagonal entry of the matrix, but the
>>> problem is bigger than that I suspect. In my computation of the local
>>> 3x3 element mass matrices, printing out the values gives all zeros
>>> (which is what the entries are initialized to be in my code).
>>>
>>> I’m at a loss for how to debug this problem. My first thought was that
>>> there is an error in the mesh, but I’m certain now that this is not the
>>> case since the mesh is generated using the exact same routine that
>>> generates all of my coarser meshes before this one that fails. For
>>> example, the mesh that is one refinement level below this one has 2048
>>> elements and works completely fine. This is how I am creating the global
>>> matrix:
>>>
>>> MatCreateSeqAIJ(PETSC_COMM_SELF, NLoc*nElems, NLoc*nElems, NLoc,
>>>     PETSC_NULL, &Mglobal);
>>>
>>> where I allocate NLoc = 3 non-zero entries per row in this case.
>>
>


