[petsc-users] MatAssemblyBegin and dapl MPI fabric

Antoine De Blois antoine.deblois at aero.bombardier.com
Thu Jan 22 09:17:43 CST 2015


Hi Everyone,

I get a strange error during a call to MatAssemblyBegin. The error message is triggered by Intel MPI, as shown below. The error does not always occur, which is even stranger.
[333:node1179] unexpected disconnect completion event from [163:node1254]
Assertion failed in file ../../dapl_conn_rc.c at line 1128: 0

All ranks output the same error message with their own node number. I did a bit of research, and some say that switching to MPICH2 solves this issue. Since our group is keen on using Intel MPI, I would like to solve this issue at the root.

A few important points:

·         At the moment, we are assembling the matrix with a single MatAssemblyBegin/End and MAT_FINAL_ASSEMBLY after doing MatSetValuesBlocked. Could it be due to a memory overflow in the buffers?

·         We are using -genv I_MPI_FABRICS shm:dapl in the submission script

·         I tried using -malloc_log and -log_summary, but the crash prevents the log output from being written

Has anyone already faced this issue?
Any suggestion is welcome,
Best regards,
Antoine DeBlois

Specialiste ingenierie, MDO lead / Engineering Specialist, MDO lead
Aéronautique / Aerospace
514-855-5001, x 50862
antoine.deblois at aero.bombardier.com

2351 Blvd Alfred-Nobel
Montreal, Qc
H4S 1A9

