[mpich-discuss] Recovering from a Bcast Timeout
Hiatt, Dave M
dave.m.hiatt at citi.com
Mon Jan 4 17:11:08 CST 2010
A general question for those in the know. From time to time I get a Bcast timeout error. I'm putting in an error handler to do a "catch" on this exception (C++). My question is: will an MPI::Finalize() followed by an MPI::Init() work from the same process? This error is being caused by our deficient network; we've never lost a blade, and after considerable investigation I'm confident both the app and MPI are functioning properly.
So are there any consequences to simply doing a Finalize() and a new Init() to start up again, or will I have to stop the whole process and start over? I'm assuming that it should restart without prejudice. I'm on the 1.2.1 releases for Windows and Linux.
Thanks
dave
"Consequences, Schmonsequences, as long as I'm rich". - Daffy Duck
Dave Hiatt
Market Risk Systems Integration
CitiMortgage, Inc.
1000 Technology Dr.
Third Floor East, M.S. 55
O'Fallon, MO 63368-2240
Phone: 636-261-1408
Mobile: 314-452-9165
FAX: 636-261-1312
Email: Dave.M.Hiatt at citigroup.com