Bob,

If you are using a large number of cores with the default Nemesis channel and the MPD process manager, this issue will affect you. It has been fixed in the current source (available via the nightly snapshots) and will be in the 1.1.1 release later this week.

(The problem was that in MPI_Init, each process was doing p queries to the process manager for info about the other processes, resulting in p^2 queries across all processes. On small p it didn't matter, but on large p the p^2 queries took too long. All of that has been fixed now.)
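
To put rough numbers on that, here is a small back-of-the-envelope C program (not MPICH code; the 0.5 ms per-query service time is an assumption, and real queries overlap to some extent) showing why the old scheme was invisible on small runs but painful at a few thousand processes:

#include <stdio.h>

/* Illustrates the scaling described above: during MPI_Init each of the p
 * processes issued roughly p queries to the process manager, so the manager
 * had to service about p * p requests before the job could get going. */
int main(void)
{
    /* Assumed (hypothetical) cost to service one query; the real value
     * depends on the cluster, the MPD ring, and the network. */
    const double seconds_per_query = 0.5e-3;
    const int sizes[] = {64, 256, 1024, 1500, 2000};
    const int n = sizeof(sizes) / sizeof(sizes[0]);

    printf("%8s %16s %20s\n", "p", "total queries", "est. startup (s)");
    for (int i = 0; i < n; i++) {
        long p = sizes[i];
        long total = p * p;   /* p queries from each of p processes */
        printf("%8ld %16ld %20.1f\n", p, total, total * seconds_per_query);
    }
    return 0;
}

At p = 2000 that is about four million queries, versus a few thousand at p = 64, which is why the quadratic behavior only shows up at scale.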

Rajeev

________________________________
From: mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov] On Behalf Of bob ilgner
Sent: Monday, July 06, 2009 5:05 AM
To: mpich-discuss@mcs.anl.gov
Subject: Re: [mpich-discuss] mpirun on 1500~2000 cores

Hi Dmitry,

What sort of cluster are you running the 1500-2000 cores on (an e1350, for example), and what is the nature of the application you are running?

I did see Rajeev's response and noted the improved loading time with the latest build.

Regards, bob

On Sun, Jul 5, 2009 at 5:03 AM, dvg <dvg@ieee.org> wrote:
> Hello,
>
> What would be considered a reasonable time for mpirun to start a job on
> 1500~2000 cores on a 1 GigE cluster?
>
> Are there any kernel (Linux) or Ethernet-related parameters that can be
> tuned to speed it up? The MPICH2 libraries were compiled with most/all
> optimization options enabled.
>
> Thank you,
> Dmitry