[MPICH2-dev] MPICH IB driver with RDMA getting extra completions?

Heinz, Michael mheinz at silverstorm.com
Thu Sep 1 10:45:11 CDT 2005


Hi,

I'm validating MPICH2 1.02 with our InfiniBand drivers and I've run into an odd problem: simple tests (bandwidth, latency, etc.) built with CH3/IB with RDMA enabled work fine, but HPL/Linpack fails with what appears to be an "extra" RDMA write completion. I say "extra" because, after poring over the debug logs, it looks to me like ibu_wait is properly waiting for each completion before moving on to the next.

Has anyone seen behavior like this? I noticed a line of commented-out code that would cause the IB device to discard the extra completion, but uncommenting it causes lots of other problems and eventually results in a VAPI_LOC_PROT_ERR.

Any ideas?



