Bob,

If you are using a large number of cores with the default Nemesis channel and the MPD process manager, this issue will affect you. It has been fixed in the current source (available via the nightly snapshots) and will be in the 1.1.1 release later this week.

(The problem was that in MPI_Init, each process was doing p queries to the process manager for info about the other processes, resulting in p^2 queries across all processes. On small p it didn't matter, but on large p the p^2 queries took too long. All of that has been fixed now.)
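
To put rough numbers on that, here is a small back-of-the-envelope C program (not MPICH code; the 0.5 ms per-query service time is an assumption, and real queries overlap to some extent) showing why the old scheme was invisible on small runs but painful at a few thousand processes:

#include <stdio.h>

/* Illustrates the scaling described above: during MPI_Init each of the p
 * processes issued roughly p queries to the process manager, so the manager
 * had to service about p * p requests before the job could get going. */
int main(void)
{
    /* Assumed (hypothetical) cost to service one query; the real value
     * depends on the cluster, the MPD ring, and the network. */
    const double seconds_per_query = 0.5e-3;
    const int sizes[] = {64, 256, 1024, 1500, 2000};
    const int n = sizeof(sizes) / sizeof(sizes[0]);

    printf("%8s %16s %20s\n", "p", "total queries", "est. startup (s)");
    for (int i = 0; i < n; i++) {
        long p = sizes[i];
        long total = p * p;   /* p queries from each of p processes */
        printf("%8ld %16ld %20.1f\n", p, total, total * seconds_per_query);
    }
    return 0;
}

At p = 2000 that is about four million queries, versus a few thousand at p = 64, which is why the quadratic behavior only shows up at scale.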

Rajeev

________________________________
From: mpich-discuss-bounces@mcs.anl.gov [mailto:mpich-discuss-bounces@mcs.anl.gov] On Behalf Of bob ilgner
Sent: Monday, July 06, 2009 5:05 AM
To: mpich-discuss@mcs.anl.gov
Subject: Re: [mpich-discuss] mpirun on 1500~2000 cores

Hi Dmitry,

What sort of cluster are you running the 1500-2000 cores on (an e1350, for example), and what is the nature of the application you are running?

I did see Rajeev's response and noted the improved loading time with the latest build.

Regards, bob

On Sun, Jul 5, 2009 at 5:03 AM, dvg <dvg@ieee.org> wrote:
> Hello,
>
> What would be considered a reasonable time for mpirun to start a job on
> 1500~2000 cores on a 1 GigE cluster?
>
> Are there any kernel (Linux) or Ethernet-related parameters that can be
> tuned to speed it up? The MPICH2 libraries were compiled with most/all
> optimization options enabled.
>
> Thank you,
> Dmitry