<div style>Hi all,</div><div style><br></div><div style>I've been running into a recurring problem with MPICH2.</div><div style>During execution, the root process tries to broadcast a matrix of about 12 MB but, sometimes, the call to MPI_BCAST never returns and the execution freezes. The "sometimes" is the real problem here: we have no clue when it is going to happen.</div>
<div class="im" style><div><br></div><div>Has anyone experienced a similar problem?</div><div><br></div></div><div style>This has raised a few questions about the behaviour of MPI_BCAST and its implementation:</div>
<div style>1) Is there any limit on the size of the buffer that can be sent?</div><div style>2) If such a limit exists, is it related to the number of processes in the communicator? In this case I am using 32 processes, but I have often run successfully on bigger clusters (over 200 processes).</div>
<div style>3) Is the content of the data being sent relevant? If some of the data is uninitialized, would that be a concern? In other words, my understanding is that the only things that matter are that the buffer size is correct on all processes (for any combination of datatype/array size) and that enough space is allocated to receive the data. Is that right?</div>
<div style>4) What is the best way to send this data? Would splitting it into smaller broadcasts be better/safer?</div><div style>5) How should I classify a 12 MB message? Small? Big? I believe it should count as fairly small, because other typical runs of mine broadcast messages of over 100 MB and succeeded.</div>
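<div style><br></div><div style>To make question 4 concrete, here is a minimal sketch of the splitting I have in mind. It is not our actual code: bcast_in_chunks, send_chunk_fn and count_bytes are hypothetical names, and the callback stands in for the real call so that the loop can be checked without MPI. In the real program I assume the callback body would simply be MPI_Bcast(chunk, count, MPI_BYTE, root, comm), with root and comm passed through ctx.</div>
<pre>
```c
/* Hypothetical callback type: sends `count` bytes starting at `chunk`
   and returns 0 on success. With MPI the body would be (assumption):
       MPI_Bcast(chunk, count, MPI_BYTE, root, comm);               */
typedef int (*send_chunk_fn)(char *chunk, int count, void *ctx);

/* Walk `total` bytes of `buf` in pieces of at most `max_chunk` bytes.
   Every rank must call this with the same `total` and `max_chunk`,
   so that all ranks issue the same sequence of broadcasts.
   Returns the number of chunks sent, or -1 on bad arguments or on
   the first failing send. */
long bcast_in_chunks(char *buf, long total, int max_chunk,
                     send_chunk_fn send, void *ctx)
{
    long offset = 0;
    long chunks = 0;

    if (total < 0 || max_chunk <= 0)
        return -1;

    while (offset != total) {
        long left = total - offset;
        /* The last piece may be shorter than max_chunk. */
        int count = left > max_chunk ? max_chunk : (int)left;

        if (send(buf + offset, count, ctx) != 0)
            return -1;
        offset += count;
        chunks++;
    }
    return chunks;
}

/* Stand-in callback for checking the loop without MPI:
   it only adds up the number of bytes it was asked to send. */
int count_bytes(char *chunk, int count, void *ctx)
{
    (void)chunk;
    *(long *)ctx += count;
    return 0;
}
```
</pre>
<div style>With a 1 MB chunk size, the 12 MB matrix would go out as twelve broadcasts instead of one. What I cannot tell is whether this actually avoids the hang or only hides it.</div>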
<div style><br></div><div style>Regards,</div><div style>Luiz</div>