Hi,<br><br>I found out recently that increasing the value of the parameter P4_SOCKBUFSIZE to the maximum allowed improves throughput significantly (Ref: Protocol Dependent Message-Passing Performance on Linux Clusters; Dave Turner and Xuehua Chen). Is there a way to achieve the same result on Windows XP / Vista?<br>

<br>Also, suppose I have 10 integer / double-precision arrays (each of dimension, say, 100000) on CPU1. If I were to send this data to CPU0, which would be faster:<br><br>1. Sending + receiving data from the 10 arrays one array at a time, OR,<br>

2. Combining the data into one single array (dimension = 10 X 100000) and then sending / receiving it?<br><br>Is this a general result?<br><br>Thanks.<br><br>Rahul.<br>