<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
IMO, the biggest hurdle for large workflows will be memory (I recommend
2GB+). If the jobs are short and end up pushing hundreds of jobs/sec to
Falkon for prolonged periods of time, having multiple processors might
also be important.<br>
<br>
Ioan<br>
<br>
Ian Foster wrote:
<blockquote
 cite="mid:704362323-1190139464-cardhu_decombobulator_blackberry.rim.net-955906906-@bxe030.bisx.prod.on.blackberry"
 type="cite">
  <pre wrap="">It seems ridiculous to me that we are still using a student-supported machine to run major applications. Surely we should have one highly capable, well-maintained machine for this? And this shouldn't be a "suggestion" but a clear policy.



-----Original Message-----
From: Ioan Raicu <a class="moz-txt-link-rfc2396E" href="mailto:iraicu@cs.uchicago.edu">&lt;iraicu@cs.uchicago.edu&gt;</a>

Date: Tue, 18 Sep 2007 13:13:08 
To: Michael Wilde <a class="moz-txt-link-rfc2396E" href="mailto:wilde@mcs.anl.gov">&lt;wilde@mcs.anl.gov&gt;</a>
Cc: <a class="moz-txt-link-abbreviated" href="mailto:swift-devel@ci.uchicago.edu">swift-devel@ci.uchicago.edu</a>
Subject: Re: [Swift-devel] bug 53




Michael Wilde wrote:
  </pre>
  <blockquote type="cite">
    <pre wrap="">It's not clear when this happened, as Nika and Ioan's workflow 
submission from viper has afaik been mostly through Falkon for quite a 
while now.

Nika, perhaps you can shift back to trying the two Falkon approaches 
(with higher prio on testing Ioan's retry code) in the meantime.

Ioan, is CI Support / Ti supporting viper, or are you the "sysadmin" 
Ben is referring to?

    </pre>
  </blockquote>
  <pre wrap=""><!---->Yes, I am viper's support.  viper is my department office machine.
  </pre>
  <blockquote type="cite">
    <pre wrap="">I've also suggested in the past that we focus on using evitable and 
terminable (and swift03/04) as our main submit hosts, primarily for 
support and coordination reasons.  Is this a good time to try the 
GRAM/non-Falkon workflow there?
    </pre>
  </blockquote>
  <pre wrap=""><!---->Sure, but watch out for the large MolDyn runs as 1GB or less of memory 
is not enough for 244 mol runs. 

Ioan
  </pre>
  <blockquote type="cite">
    <pre wrap="">- Mike


Ben Clifford wrote:
    </pre>
    <blockquote type="cite">
      <pre wrap="">sounds like viper had firewall configuration changed recently. viper 
sysadmin needs to help debug basic job submission with simple globus 
tools before that machine is worth using again.

On Tue, 18 Sep 2007, Michael Wilde wrote:

      </pre>
      <blockquote type="cite">
        <pre wrap="">does the cog equivalent of globus_tcp_source_range also need to be set?
is that only for gridftp, or gram as well?  or could this be a 
gridftp hang?

- mike

Ben Clifford wrote:
        </pre>
        <blockquote type="cite">
          <pre wrap="">can you submit a job using globus-job-run?

On Tue, 18 Sep 2007, Veronika Nefedova wrote:

          </pre>
          <blockquote type="cite">
            <pre wrap="">I set tcp.port.range in swift properties but even a simple helloworld
workflow hangs (the submit host doesn't receive the notification from the
compute host that the job has finished).
tcp.port.range=50000,60000

Not sure what else has changed on viper? It used to be a very good submit
host, I never had any problems with it );

Nika

On Sep 18, 2007, at 9:13 AM, Mihael Hategan wrote:

            </pre>
            <blockquote type="cite">
              <pre wrap="">Should pick that one. If not ~/.globus/cog.properties ->
tcp.port.range=begin,end
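
A minimal sketch (plain Python, not part of cog or Swift) of a sanity check
for the setting above: it parses a tcp.port.range value and tries to bind a
socket inside that range on the submit host. The function names are
hypothetical. A successful bind only shows that the port is free locally; it
does not prove that the firewall lets inbound connections through.

```python
import socket


def parse_port_range(value):
    """Parse a 'begin,end' string such as '50000,60000' into two ints."""
    begin_text, end_text = value.split(",")
    begin, end = int(begin_text), int(end_text)
    if not (begin >= 1 and end >= begin and 65535 >= end):
        raise ValueError("invalid port range: " + value)
    return begin, end


def first_bindable_port(begin, end, host=""):
    """Return the first port in [begin, end] that can be bound, else None."""
    for port in range(begin, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port in use or not permitted; try the next one
            return port
    return None


if __name__ == "__main__":
    # Range taken from the thread; adjust to match the firewall rules.
    begin, end = parse_port_range("50000,60000")
    print(first_bindable_port(begin, end))
```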

On Tue, 2007-09-18 at 07:42 +0000, Ben Clifford wrote:
              </pre>
              <blockquote type="cite">
                <pre wrap="">Not sure if cog picks up the GLOBUS_whatever environment variables.
Mihael presumably knows.

On Mon, 17 Sep 2007, Ioan Raicu wrote:

                </pre>
                <blockquote type="cite">
                  <pre wrap="">There is a firewall on viper.  Ports 50000 - 60000 are open for TCP.
You might want to set the TCP_PORT_RANGE (I am not sure this is the exact
environment variable, but something like that) to be between 50K and 60K
ports to ensure that GT4 uses one of these open ports.
Ioan

Veronika Nefedova wrote:
                  </pre>
                  <blockquote type="cite">
                    <pre wrap="">The same. You can check the job's status in its log on viper in
~nefedova/alamines/MolDyn-244-loops-20070917-1356-h95gxij8.log.

The job is still running (i.e. hanging) with the same symptom as before:
the first job is done and then nothing else gets submitted (the submit host
doesn't receive any notification that the job has finished).

Nika

On Sep 17, 2007, at 9:51 AM, Mihael Hategan wrote:

                    </pre>
                    <blockquote type="cite">
                      <pre wrap="">On Mon, 2007-09-17 at 09:41 -0500, Veronika Nefedova wrote:
                      </pre>
                      <blockquote type="cite">
                        <pre wrap="">I did 'svn up' in the cog directory and then did 'ant dist' in the
same directory.
                        </pre>
                      </blockquote>
                      <pre wrap="">'ant dist' should be done in the swift directory.

                      </pre>
                      <blockquote type="cite">
                        <pre wrap="">My 'svn info' gives me r1740.

On Sep 17, 2007, at 8:55 AM, Mihael Hategan wrote:

                        </pre>
                        <blockquote type="cite">
                          <pre wrap="">Did you update cog?

On Mon, 2007-09-17 at 08:38 -0500, Veronika Nefedova wrote:
                          </pre>
                          <blockquote type="cite">
                            <pre wrap="">No, I've tried with r1740, it still hung (timed out).
The log is on viper:/home/nefedova/alamines/MolDyn-244-loops-20070914-1834-pvhyji75.log

Nika

On Sep 15, 2007, at 10:59 AM, Mihael Hategan wrote:

                            </pre>
                            <blockquote type="cite">
                              <pre wrap="">On Sat, 2007-09-15 at 09:06 +0000, Ben Clifford wrote:
                              </pre>
                              <blockquote type="cite">
                                <pre wrap="">On Fri, 14 Sep 2007, Mihael Hategan wrote:

                                </pre>
                                <blockquote type="cite">
                                  <pre wrap="">On Thu, 2007-09-13 at 16:41 -0500, Mihael Hategan wrote:
                                  </pre>
                                  <blockquote type="cite">
                                    <pre wrap="">Ok, so there's something in.
                                    </pre>
                                  </blockquote>
                                  <pre wrap="">That something was throttling a bit too much (not just jobs, but all
tasks on that site). I need to take a second look at it.
                                  </pre>
                                </blockquote>
                                <pre wrap="">Is that fixed by cog r1740? It looks like that commit is intended to.
                                </pre>
                              </blockquote>
                              <pre wrap="">It's an attempt to fix it, but it needs to be confirmed by Nika.

_______________________________________________
Swift-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a>
<a class="moz-txt-link-freetext" href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel">http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel</a>

                              </pre>
                            </blockquote>
                          </blockquote>
                        </blockquote>
                      </blockquote>
                    </blockquote>
                  </blockquote>
                </blockquote>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
      </blockquote>
      <pre wrap="">
      </pre>
    </blockquote>
  </blockquote>
  <pre wrap=""><!---->
  </pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">-- 
============================================
Ioan Raicu
Ph.D. Student
============================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
============================================
Email: <a class="moz-txt-link-abbreviated" href="mailto:iraicu@cs.uchicago.edu">iraicu@cs.uchicago.edu</a>
Web:   <a class="moz-txt-link-freetext" href="http://www.cs.uchicago.edu/~iraicu">http://www.cs.uchicago.edu/~iraicu</a>
       <a class="moz-txt-link-freetext" href="http://dsl.cs.uchicago.edu/">http://dsl.cs.uchicago.edu/</a>
============================================
============================================</pre>
</body>
</html>