From maltaweel at anl.gov  Thu Oct  1 12:19:25 2009
From: maltaweel at anl.gov (Altaweel, Mark R.)
Date: Thu, 1 Oct 2009 12:19:25 -0500
Subject: [Swift-user] Using swift for PCs
Message-ID: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>

Hi,

I just started trying to use swift, and we are investigating to see if swift can be useful for our applications. Our users apply windows, mac, and linux/unix environments for their work, so we need to have swift work on all three platforms. I got it to run on the mac and linux side of things, but I am having a little trouble running it on windows.

First, I am trying to run swift-0.9 through cygwin. I followed the install instructions for swift and when I tried to run any of the demos (e.g., first.swift) I get this error:

SWIFT_HOME is not set, and all attempts at guessing it failed.
-------------------------------------------------------------------------------------------

However, I did set SWIFT_HOME in the System. I am not sure what the problem is. Any hints would be appreciated.

Thanks.

Mark

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091001/f654569c/attachment.html>

From wilde at mcs.anl.gov  Thu Oct  1 13:29:54 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 01 Oct 2009 13:29:54 -0500
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
Message-ID: <4AC4F522.3070003@mcs.anl.gov>

Mark, the swift command should set $SWIFT_HOME by finding where it was 
executed from. That logic seems not to be working under cygwin, likely 
due to some shell or command differences. We'll need to investigate, but 
perhaps this gives you a clue to help you to down the problem.

- Mike


On 10/1/09 12:19 PM, Altaweel, Mark R. wrote:
> Hi,
>  
> I just started trying to use swift, and we are investigating to see if 
> swift can be useful for our applications. Our users apply windows, mac, 
> and linux/unix environments for their work, so we need to have swift 
> work on all three platforms. I got it to run on the mac and linux side 
> of things, but I am having a little trouble running it on windows.
>  
> First, I am trying to run swift-0.9 through cygwin. I followed the 
> install instructions for swift and when I tried to run any of the demos 
> (e.g., first.swift) I get this error:
>  
> SWIFT_HOME is not set, and all attempts at guessing it failed.
> -------------------------------------------------------------------------------------------
>  
> However, I did set SWIFT_HOME in the System. I am not sure what the 
> problem is. Any hints would be appreciated.
>  
> Thanks.
>  
> Mark
>  
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From foster at anl.gov  Thu Oct  1 13:30:25 2009
From: foster at anl.gov (Ian Foster)
Date: Thu, 1 Oct 2009 13:30:25 -0500
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
Message-ID: <C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>

One option is that we can set them up to log in to a Linux server and  
run from there?


On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:

> Hi,
>
> I just started trying to use swift, and we are investigating to see  
> if swift can be useful for our applications. Our users apply  
> windows, mac, and linux/unix environments for their work, so we need  
> to have swift work on all three platforms. I got it to run on the  
> mac and linux side of things, but I am having a little trouble  
> running it on windows.
>
> First, I am trying to run swift-0.9 through cygwin. I followed the  
> install instructions for swift and when I tried to run any of the  
> demos (e.g., first.swift) I get this error:
>
> SWIFT_HOME is not set, and all attempts at guessing it failed.
> -------------------------------------------------------------------------------------------
>
> However, I did set SWIFT_HOME in the System. I am not sure what the  
> problem is. Any hints would be appreciated.
>
> Thanks.
>
> Mark
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091001/b91c9a3d/attachment.html>

From wilde at mcs.anl.gov  Thu Oct  1 13:42:07 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 01 Oct 2009 13:42:07 -0500
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
	<C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>
Message-ID: <4AC4F7FF.3080300@mcs.anl.gov>

Right, in our discussion yesterday we went through all the systems 
available, and Mark and Jonathan too have Linus systems and clusters.

They are just sanity testing how Swift runs on Windows because they have 
a large Windows user community.

- Mike


On 10/1/09 1:30 PM, Ian Foster wrote:
> One option is that we can set them up to log in to a Linux server and 
> run from there?
> 
> 
> On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:
> 
>> Hi,
>>  
>> I just started trying to use swift, and we are investigating to see if 
>> swift can be useful for our applications. Our users apply windows, 
>> mac, and linux/unix environments for their work, so we need to have 
>> swift work on all three platforms. I got it to run on the mac and 
>> linux side of things, but I am having a little trouble running it on 
>> windows.
>>  
>> First, I am trying to run swift-0.9 through cygwin. I followed the 
>> install instructions for swift and when I tried to run any of the 
>> demos (e.g., first.swift) I get this error:
>>  
>> SWIFT_HOME is not set, and all attempts at guessing it failed.
>> -------------------------------------------------------------------------------------------
>>  
>> However, I did set SWIFT_HOME in the System. I am not sure what the 
>> problem is. Any hints would be appreciated.
>>  
>> Thanks.
>>  
>> Mark
>>  
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From maltaweel at anl.gov  Thu Oct  1 13:43:47 2009
From: maltaweel at anl.gov (Altaweel, Mark R.)
Date: Thu, 1 Oct 2009 13:43:47 -0500
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <4AC4F7FF.3080300@mcs.anl.gov>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
	<C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>
	<4AC4F7FF.3080300@mcs.anl.gov>
Message-ID: <156080967112E24F88A6F6FF0E10DA6D7E552E@OZZY.anl.gov>

Yes, that's exactly the case. We need to make sure that a variety of users can run this, including those who simply want to run Repast on single workstations with multiple processors/cores.

Mark 

-----Original Message-----
From: Michael Wilde [mailto:wilde at mcs.anl.gov] 
Sent: Thursday, October 01, 2009 1:42 PM
To: Ian Foster
Cc: Altaweel, Mark R.; 'swift-user at ci.uchicago.edu'
Subject: Re: [Swift-user] Using swift for PCs

Right, in our discussion yesterday we went through all the systems available, and Mark and Jonathan too have Linus systems and clusters.

They are just sanity testing how Swift runs on Windows because they have a large Windows user community.

- Mike


On 10/1/09 1:30 PM, Ian Foster wrote:
> One option is that we can set them up to log in to a Linux server and 
> run from there?
> 
> 
> On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:
> 
>> Hi,
>>  
>> I just started trying to use swift, and we are investigating to see 
>> if swift can be useful for our applications. Our users apply windows, 
>> mac, and linux/unix environments for their work, so we need to have 
>> swift work on all three platforms. I got it to run on the mac and 
>> linux side of things, but I am having a little trouble running it on 
>> windows.
>>  
>> First, I am trying to run swift-0.9 through cygwin. I followed the 
>> install instructions for swift and when I tried to run any of the 
>> demos (e.g., first.swift) I get this error:
>>  
>> SWIFT_HOME is not set, and all attempts at guessing it failed.
>> ---------------------------------------------------------------------
>> ----------------------
>>  
>> However, I did set SWIFT_HOME in the System. I am not sure what the 
>> problem is. Any hints would be appreciated.
>>  
>> Thanks.
>>  
>> Mark
>>  
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 
> 
> ----------------------------------------------------------------------
> --
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From hockyg at uchicago.edu  Thu Oct  1 13:46:55 2009
From: hockyg at uchicago.edu (Glen Hocky)
Date: Thu, 1 Oct 2009 14:46:55 -0400
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <156080967112E24F88A6F6FF0E10DA6D7E552E@OZZY.anl.gov>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
	<C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>
	<4AC4F7FF.3080300@mcs.anl.gov>
	<156080967112E24F88A6F6FF0E10DA6D7E552E@OZZY.anl.gov>
Message-ID: <e71fa8ae0910011146m37ccc2f7uc248cfddf3a6f1eb@mail.gmail.com>

I spent a few hours trying to get swift 0.9 to work in Cygwin w/ Alex Moore
in David Biron's group. We were never successful, so it seems some
especially advanced configuration is necessary.

Glen

On Thu, Oct 1, 2009 at 2:43 PM, Altaweel, Mark R. <maltaweel at anl.gov> wrote:

> Yes, that's exactly the case. We need to make sure that a variety of users
> can run this, including those who simply want to run Repast on single
> workstations with multiple processors/cores.
>
> Mark
>
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Thursday, October 01, 2009 1:42 PM
> To: Ian Foster
> Cc: Altaweel, Mark R.; 'swift-user at ci.uchicago.edu'
> Subject: Re: [Swift-user] Using swift for PCs
>
> Right, in our discussion yesterday we went through all the systems
> available, and Mark and Jonathan too have Linus systems and clusters.
>
> They are just sanity testing how Swift runs on Windows because they have a
> large Windows user community.
>
> - Mike
>
>
>
>
> On 10/1/09 1:30 PM, Ian Foster wrote:
> > One option is that we can set them up to log in to a Linux server and
> > run from there?
> >
> >
> > On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:
> >
> >> Hi,
> >>
> >> I just started trying to use swift, and we are investigating to see
> >> if swift can be useful for our applications. Our users apply windows,
> >> mac, and linux/unix environments for their work, so we need to have
> >> swift work on all three platforms. I got it to run on the mac and
> >> linux side of things, but I am having a little trouble running it on
> >> windows.
> >>
> >> First, I am trying to run swift-0.9 through cygwin. I followed the
> >> install instructions for swift and when I tried to run any of the
> >> demos (e.g., first.swift) I get this error:
> >>
> >> SWIFT_HOME is not set, and all attempts at guessing it failed.
> >> ---------------------------------------------------------------------
> >> ----------------------
> >>
> >> However, I did set SWIFT_HOME in the System. I am not sure what the
> >> problem is. Any hints would be appreciated.
> >>
> >> Thanks.
> >>
> >> Mark
> >>
> >> _______________________________________________
> >> Swift-user mailing list
> >> Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> >
> >
> > ----------------------------------------------------------------------
> > --
> >
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091001/5656989c/attachment.html>

From iraicu at cs.uchicago.edu  Thu Oct  1 14:20:37 2009
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Thu, 01 Oct 2009 14:20:37 -0500
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <e71fa8ae0910011146m37ccc2f7uc248cfddf3a6f1eb@mail.gmail.com>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>	<C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>	<4AC4F7FF.3080300@mcs.anl.gov>	<156080967112E24F88A6F6FF0E10DA6D7E552E@OZZY.anl.gov>
	<e71fa8ae0910011146m37ccc2f7uc248cfddf3a6f1eb@mail.gmail.com>
Message-ID: <4AC50105.2080609@cs.uchicago.edu>

Hi,
I just played with the new virtualization tool called VirtualBox, made 
by Sun.
http://www.virtualbox.org/

Its has similar functionality as VMWare, but its much lighter weight 
(40MB install), its free, and works across Windows, Macs, and Linux 
(running in user space, so there is no special requirements from the 
underlying OS or even hardware). If Cygwin doesn't work out, why not try 
setting up a minimal install of Linux with Swift pre-installed in 
Virtual Box? I also think Virtual Box has some nice file sharing 
features, so the Linux install can access data from Windows.

Just another idea to try out.

Ioan

Glen Hocky wrote:
> I spent a few hours trying to get swift 0.9 to work in Cygwin w/ Alex 
> Moore in David Biron's group. We were never successful, so it seems 
> some especially advanced configuration is necessary.
>
> Glen
>
> On Thu, Oct 1, 2009 at 2:43 PM, Altaweel, Mark R. <maltaweel at anl.gov 
> <mailto:maltaweel at anl.gov>> wrote:
>
>     Yes, that's exactly the case. We need to make sure that a variety
>     of users can run this, including those who simply want to run
>     Repast on single workstations with multiple processors/cores.
>
>     Mark
>
>     -----Original Message-----
>     From: Michael Wilde [mailto:wilde at mcs.anl.gov
>     <mailto:wilde at mcs.anl.gov>]
>     Sent: Thursday, October 01, 2009 1:42 PM
>     To: Ian Foster
>     Cc: Altaweel, Mark R.; 'swift-user at ci.uchicago.edu
>     <mailto:swift-user at ci.uchicago.edu>'
>     Subject: Re: [Swift-user] Using swift for PCs
>
>     Right, in our discussion yesterday we went through all the systems
>     available, and Mark and Jonathan too have Linus systems and clusters.
>
>     They are just sanity testing how Swift runs on Windows because
>     they have a large Windows user community.
>
>     - Mike
>
>
>
>
>     On 10/1/09 1:30 PM, Ian Foster wrote:
>     > One option is that we can set them up to log in to a Linux
>     server and
>     > run from there?
>     >
>     >
>     > On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:
>     >
>     >> Hi,
>     >>
>     >> I just started trying to use swift, and we are investigating to see
>     >> if swift can be useful for our applications. Our users apply
>     windows,
>     >> mac, and linux/unix environments for their work, so we need to have
>     >> swift work on all three platforms. I got it to run on the mac and
>     >> linux side of things, but I am having a little trouble running
>     it on
>     >> windows.
>     >>
>     >> First, I am trying to run swift-0.9 through cygwin. I followed the
>     >> install instructions for swift and when I tried to run any of the
>     >> demos (e.g., first.swift) I get this error:
>     >>
>     >> SWIFT_HOME is not set, and all attempts at guessing it failed.
>     >>
>     ---------------------------------------------------------------------
>     >> ----------------------
>     >>
>     >> However, I did set SWIFT_HOME in the System. I am not sure what the
>     >> problem is. Any hints would be appreciated.
>     >>
>     >> Thanks.
>     >>
>     >> Mark
>     >>
>     >> _______________________________________________
>     >> Swift-user mailing list
>     >> Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>     <mailto:Swift-user at ci.uchicago.edu
>     <mailto:Swift-user at ci.uchicago.edu>>
>     >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>     >
>     >
>     >
>     ----------------------------------------------------------------------
>     > --
>     >
>     > _______________________________________________
>     > Swift-user mailing list
>     > Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>     > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>     _______________________________________________
>     Swift-user mailing list
>     Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>     http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

-- 
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384 
Evanston, IL 60208-3118
=================================================================
Cel:   1-847-722-0876
Tel:   1-847-491-8163
Email: iraicu at eecs.northwestern.edu
Web:   http://www.eecs.northwestern.edu/~iraicu/
       http://cucis.ece.northwestern.edu/
=================================================================
=================================================================


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091001/e5358a48/attachment.html>

From hockyg at uchicago.edu  Thu Oct  1 14:33:25 2009
From: hockyg at uchicago.edu (Glen Hocky)
Date: Thu, 1 Oct 2009 15:33:25 -0400
Subject: [Swift-user] Using swift for PCs
In-Reply-To: <4AC50105.2080609@cs.uchicago.edu>
References: <156080967112E24F88A6F6FF0E10DA6D7E552A@OZZY.anl.gov>
	<C190F87A-1757-4684-86BB-2B3E4DFB683E@anl.gov>
	<4AC4F7FF.3080300@mcs.anl.gov>
	<156080967112E24F88A6F6FF0E10DA6D7E552E@OZZY.anl.gov>
	<e71fa8ae0910011146m37ccc2f7uc248cfddf3a6f1eb@mail.gmail.com>
	<4AC50105.2080609@cs.uchicago.edu>
Message-ID: <e71fa8ae0910011233x5a5c215cw5cf4204c69448924@mail.gmail.com>

I'm also a big fan of VirtualBox and one advantage of this would be that
VirtualBox runs great on all 3 platforms (i've tested) so you could have
everyone running the same virtual image, so configuration on all hosts would
be equivalent.

Glen

On Thu, Oct 1, 2009 at 3:20 PM, Ioan Raicu <iraicu at cs.uchicago.edu> wrote:

>  Hi,
> I just played with the new virtualization tool called VirtualBox, made by
> Sun.
> http://www.virtualbox.org/
>
> Its has similar functionality as VMWare, but its much lighter weight (40MB
> install), its free, and works across Windows, Macs, and Linux (running in
> user space, so there is no special requirements from the underlying OS or
> even hardware). If Cygwin doesn't work out, why not try setting up a minimal
> install of Linux with Swift pre-installed in Virtual Box? I also think
> Virtual Box has some nice file sharing features, so the Linux install can
> access data from Windows.
>
> Just another idea to try out.
>
> Ioan
>
> Glen Hocky wrote:
>
> I spent a few hours trying to get swift 0.9 to work in Cygwin w/ Alex Moore
> in David Biron's group. We were never successful, so it seems some
> especially advanced configuration is necessary.
>
> Glen
>
> On Thu, Oct 1, 2009 at 2:43 PM, Altaweel, Mark R. <maltaweel at anl.gov>wrote:
>
>> Yes, that's exactly the case. We need to make sure that a variety of users
>> can run this, including those who simply want to run Repast on single
>> workstations with multiple processors/cores.
>>
>> Mark
>>
>> -----Original Message-----
>> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
>> Sent: Thursday, October 01, 2009 1:42 PM
>> To: Ian Foster
>> Cc: Altaweel, Mark R.; 'swift-user at ci.uchicago.edu'
>> Subject: Re: [Swift-user] Using swift for PCs
>>
>> Right, in our discussion yesterday we went through all the systems
>> available, and Mark and Jonathan too have Linus systems and clusters.
>>
>> They are just sanity testing how Swift runs on Windows because they have a
>> large Windows user community.
>>
>> - Mike
>>
>>
>>
>>
>> On 10/1/09 1:30 PM, Ian Foster wrote:
>> > One option is that we can set them up to log in to a Linux server and
>> > run from there?
>> >
>> >
>> > On Oct 1, 2009, at 12:19 PM, Altaweel, Mark R. wrote:
>> >
>> >> Hi,
>> >>
>> >> I just started trying to use swift, and we are investigating to see
>> >> if swift can be useful for our applications. Our users apply windows,
>> >> mac, and linux/unix environments for their work, so we need to have
>> >> swift work on all three platforms. I got it to run on the mac and
>> >> linux side of things, but I am having a little trouble running it on
>> >> windows.
>> >>
>> >> First, I am trying to run swift-0.9 through cygwin. I followed the
>> >> install instructions for swift and when I tried to run any of the
>> >> demos (e.g., first.swift) I get this error:
>> >>
>> >> SWIFT_HOME is not set, and all attempts at guessing it failed.
>> >> ---------------------------------------------------------------------
>> >> ----------------------
>> >>
>> >> However, I did set SWIFT_HOME in the System. I am not sure what the
>> >> problem is. Any hints would be appreciated.
>> >>
>> >> Thanks.
>> >>
>> >> Mark
>> >>
>> >> _______________________________________________
>> >> Swift-user mailing list
>> >> Swift-user at ci.uchicago.edu <mailto:Swift-user at ci.uchicago.edu>
>> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>> >
>> >
>> > ----------------------------------------------------------------------
>> > --
>> >
>> > _______________________________________________
>> > Swift-user mailing list
>> > Swift-user at ci.uchicago.edu
>> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>
>
> ------------------------------
>
> _______________________________________________
> Swift-user mailing listSwift-user at ci.uchicago.eduhttp://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>
>
> --
> =================================================================
> Ioan Raicu, Ph.D.
> NSF/CRA Computing Innovation Fellow
> =================================================================
> Center for Ultra-scale Computing and Information Security (CUCIS)
> Department of Electrical Engineering and Computer Science
> Northwestern University
> 2145 Sheridan Rd, Tech M384
> Evanston, IL 60208-3118
> =================================================================
> Cel:   1-847-722-0876
> Tel:   1-847-491-8163
> Email: iraicu at eecs.northwestern.edu
> Web:   http://www.eecs.northwestern.edu/~iraicu/ <http://www.eecs.northwestern.edu/%7Eiraicu/>
>        http://cucis.ece.northwestern.edu/
> =================================================================
> =================================================================
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091001/8e080ccc/attachment.html>

From iraicu at cs.uchicago.edu  Tue Oct 13 17:15:19 2009
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Tue, 13 Oct 2009 17:15:19 -0500
Subject: [Swift-user] CFP: ACM Int. Symposium High Performance Distributed
 Computing (HPDC) 2010
Message-ID: <4AD4FBF7.4070601@cs.uchicago.edu>

ACM HPDC 2010 Call For Papers

19th ACM International Symposium on 
High Performance Distributed Computing 

Chicago, Illinois
June 21-25, 2010

http://hpdc2010.eecs.northwestern.edu

The ACM International Symposium on High Performance Distributed
Computing (HPDC) is the premier venue for presenting the latest
research on the design, implementation, evaluation, and use of
parallel and distributed systems for high performance and high end
computing.  The 19th installment of HPDC will take place in the heart
of the Chicago, Illinois, the third largest city in the United States
and a major technological and cultural capital.  The conference will be
held on June 23-25 (Wednesday through Friday) with affiliated
workshops occurring on June 21-22 (Monday and Tuesday).

Submissions are welcomed on all forms of high performance distributed
computing, including grids, clouds, clusters, service-oriented
computing, utility computing, peer-to-peer systems, and global
computing ensembles. New scholarly research showing empirical and
reproducible results in architectures, systems, and networks is
strongly encouraged, as are experience reports of applications and
deployments that can provide insights for future high performance
distributed computing research.

All papers will be rigorously reviewed by a distinguished program
committee, with a strong focus on the combination of rigorous
scientific results and likely high impact within high performance
distributed computing.  Research papers must clearly demonstrate
research contributions and novelty while experience reports must
clearly describe lessons learned and demonstrate impact. Topics of
interest include (but are not limited to) the following, in the
context of high performance distributed computing and high end
computing:

  * Systems 
  * Architectures
  * Algorithms
  * Networking
  * Programming languages and environments
  * Data management
  * I/O and file systems
  * Virtualization
  * Resource management, scheduling, and load-balancing
  * Performance modeling, simulation, and prediction
  * Fault tolerance, reliability and availability
  * Security, configuration, policy, and management issues
  * Multicore issues and opportunities
  * Models and use cases for utility, grid, and cloud computing

Both full papers and short papers (for poster presentation and/or
demonstrations) may be submitted.

IMPORTANT DATES

Paper Abstract submissions:     January 15, 2010
Paper submissions:              January 22, 2010
Author notification: 		March 30, 2010
Final manuscripts:	 	April 23, 2010

SUBMISSIONS

Authors are invited to submit full papers of at most 12 pages or short
papers of at most 4 pages.  The page limits include all figures and
references.  Papers should be formatted in the ACM proceedings style
(e.g., http://www.acm.org/sigs/publications/proceedings-templates).
Reviewing is single-blind.  Papers must be self-contained and provide
the technical substance required for the program committee to evaluate
the paper's contribution, including how it differs from prior
work. All papers will be reviewed and judged on correctness,
originality, technical strength, significance, quality of
presentation, and interest and relevance to the conference. Submitted
papers must be original work that has not appeared in and is not under
consideration for another conference or a journal.  There will be NO
DEADLINE EXTENSIONS.

PUBLICATION

Accepted full and short papers will appear in the conference proceedings.

WORKSHOPS

A separate call for workshops is available at 
http://hpdc2010.eecs.northwestern.edu/hpdc2010-cfw.txt.  The deadline 
for workshop proposals is November 2, 2009.


GENERAL CO-CHAIRS
Kate Keahey, Argonne National Labs
Salim Hariri, University of Arizona

STEERING COMMITTEE
Salim Hariri, Univ. of Arizona (Chair)
Andrew A. Chien, Intel / UCSD
Henri Bal, Vrije University
Franck Cappello, INRIA
Jack Dongarra, Univ. of Tennessee
Ian Foster, ANL& Univ. of Chicago
Andrew Grimshaw, Univ. of Virginia
Carl Kesselman, USC/ISI
Dieter Kranzlmueller, Ludwig-Maximilians-Univ. Muenchen
Miron Livny, Univ. of Wisconsin
Manish Parashar, Rutgers University
Karsten Schwan, Georgia Tech
David Walker, Univ. of Cardiff
Rich Wolski, UCSB

PROGRAM CHAIR
Peter Dinda, Northwestern University

PROGRAM COMMITTEE
Ron Brightwell, Sandia National Labs
Fabian Bustamante, Northwestern University
Henri Bal, Vrije Universiteit
Frank Cappello, INRIA
Claris Castillo, IBM Research
Henri Casanova, University of Hawaii
Abhishek Chandra, University of Minnesota
Chris Colohan, Google
Brian Cooper, Yahoo Research
Wu-chun Feng, Virginia Tech
Jose Fortes, University of Florida
Ian Foster, University of Chicago / Argonne
Geoffrey Fox, Indiana University
Michael Gerndt, TU-Munich
Andrew Grimshaw, University of Virginia
Thilo Kielmann, Vrije Universiteit
Arthur Maccabe, Oak Ridge National Labs
Satoshi Matsuoka, Toyota Institute of Technology
Jose Moreira, IBM Research
Klara Nahrstedt, UIUC
Dushyanth Narayanan, Microsoft Research
Manish Parashar, Rutgers University
Joel Saltz, Emory University
Karsten Schwan, Georgia Tech
Thomas Stricker, Google
Jaspal Subhlok, University of Houston
Michela Taufer, University of Delaware
Valerie Taylor, TAMU
Douglas Thain, University of Notre Dame
Jon Weissman, University of Minnesota
Rich Wolski, UCSB and Eucalyptus Systems
Dongyan Xu, Purdue University
Ken Yocum, UCSD

WORKSHOP CHAIR
Douglas Thain, University of Notre Dame

PUBLICITY CO-CHAIRS
Martin Swany, U. Delaware
Morris Riedel, Juelich Supercomputing Centre
Renato Ferreira, Universidade Federal de Minas Gerais
Kento Aida, NII and Tokyo Institute of Technology

LOCAL ARRANGEMENTS CHAIR
Zhiling Lan, IIT

STUDENT ACTIVITIES CO-CHAIRS
John Lange, Northwestern University
Ioan Raicu, Northwestern University


-- 
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384 
Evanston, IL 60208-3118
=================================================================
Cel:   1-847-722-0876
Tel:   1-847-491-8163
Email: iraicu at eecs.northwestern.edu
Web:   http://www.eecs.northwestern.edu/~iraicu/
       http://cucis.ece.northwestern.edu/
=================================================================
=================================================================


From skenny at uchicago.edu  Wed Oct 14 15:39:15 2009
From: skenny at uchicago.edu (skenny at uchicago.edu)
Date: Wed, 14 Oct 2009 15:39:15 -0500 (CDT)
Subject: [Swift-user] Re: [Swift-devel] burnin' up ranger w/the
 latest coasters
In-Reply-To: <20091013111417.CDU59058@m4500-02.uchicago.edu>
References: <20091013111417.CDU59058@m4500-02.uchicago.edu>
Message-ID: <20091014153915.CDW69329@m4500-02.uchicago.edu>

for those interested, here are the config files used for this run:

swift.properties:

sites.file=config/coaster_ranger.xml
tc.file=/ci/projects/cnari/config/tc.data
lazy.errors=false
caching.algorithm=LRU
pgraph=false
pgraph.graph.options=splines="compound", rankdir="TB"
pgraph.node.options=color="seagreen", style="filled"
clustering.enabled=false
clustering.queue.delay=4
clustering.min.time=60
kickstart.enabled=maybe
kickstart.always.transfer=false
wrapperlog.always.transfer=false
throttle.submit=3
throttle.host.submit=8
throttle.score.job.factor=64
throttle.transfers=16
throttle.file.operations=16
sitedir.keep=false
execution.retries=3
replication.enabled=false
replication.min.queue.time=60
replication.limit=3
foreach.max.threads=16384

coaster_ranger.xml:

<config>
 <import file="sys.xml"/>
<import file="vdl-lib.xml"/>
<set name="username">
  <arg><vdl:new type="string" value="user"/></arg>
</set>
<pool handle="RANGER">
    <profile namespace="karajan"
key="jobThrottle">1000.0</profile>
    <filesystem provider="coaster"
url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
    <profile namespace="globus" key="queue">normal</profile>
    <profile namespace="globus" key="workersPerNode">32</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="slots">16</profile>
    <profile namespace="globus" key="maxNodes">8192</profile>
    <profile namespace="globus" key="maxTime">72000</profile>
    <profile namespace="globus"
key="project">TG-DBS080004N</profile>
    <execution provider="coaster"
url="gatekeeper.ranger.tacc.teragrid.org"
jobManager="gt2:gt2:SGE"/>
    <!--
workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory
-->
   
<workdirectory>/work/00926/tg459516/sidgrid_out/{username}</workdirectory>
  </pool>
</config>


---- Original message ----
>Date: Tue, 13 Oct 2009 11:14:17 -0500 (CDT)
>From: <skenny at uchicago.edu>  
>Subject: [Swift-devel] burnin' up ranger w/the latest coasters  
>To: swift-devel at ci.uchicago.edu
>
>Final status:  Finished successfully:131072 
>
>re-running some of the workflows from our recent SEM
>paper with the latest swift...sadly, queue time on ranger has
>only gone up since those initial runs...but luckily coasters
>has speeded things up, so it ends up evening out for time to
>solution :)
>
>not sure i fully understand the plot:
>
>http://www.ci.uchicago.edu/~skenny/workflows/sem_131k/
>
>log is here:
>
>/ci/projects/cnari/logs/skenny/4reg_2cond-20091012-1607-ugidm2s2.log
>_______________________________________________
>Swift-devel mailing list
>Swift-devel at ci.uchicago.edu
>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel


From iraicu at cs.uchicago.edu  Wed Oct 14 16:41:18 2009
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Wed, 14 Oct 2009 16:41:18 -0500
Subject: [Swift-user] Call for Workshops: ACM Int. Symposium High
 Performance Distributed Computing (HPDC) 2010
Message-ID: <4AD6457E.8040004@cs.uchicago.edu>

HPDC 2010 - Call for Workshops

http://hpdc2010.eecs.northwestern.edu

We invite proposals for workshops to be held with the ACM Symposium on
High Performance Distributed Computing to be held in Chicago, Illinois
in June 2010.  Workshops will be held June 21-22, preceding the main
conference sessions June 23-25.

Workshops provide forums for discussion among researchers and
practitioners on focused topics or emerging research areas. Workshops
may be organized in whatever way is appropriate to the topic, possibly
including invited talks, panel discussions, presentation of work in
progress, or full peer-reviewer papers. Each workshop will be a full
day event hosting 20-40 participants.

A workshop must be proposed in writing and sent to dthain at nd.edu. A workshop
proposal should consist of:

 * Name of the workshop.
 * A few paragraphs describing the theme of the workshop 
   and how it relates to the  overall conference.
 * Data about previous offerings of the workshop, including attendance, 
   number of papers or presentations submitted and accepted.
 * Names and affiliations of the workshop organizers, and if applicable, 
   a significant portion of the program committee.
 * Plan for attracting submissions and attendees.
 * Timeline for milestones such as call for papers, 
   submission deadline, and so forth.

Workshop Proposal Deadline:      November 2, 2009
Workshop Notification:           November 9, 2009
Workshop Calls Online:           November 23, 2009
Workshop Proceedings Due:        April 23, 2010


-- 
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384 
Evanston, IL 60208-3118
=================================================================
Cel:   1-847-722-0876
Tel:   1-847-491-8163
Email: iraicu at eecs.northwestern.edu
Web:   http://www.eecs.northwestern.edu/~iraicu/
       https://wiki.cucis.eecs.northwestern.edu/
=================================================================
=================================================================


From fedorov at bwh.harvard.edu  Mon Oct 19 22:35:21 2009
From: fedorov at bwh.harvard.edu (Andriy Fedorov)
Date: Mon, 19 Oct 2009 23:35:21 -0400
Subject: [Swift-user] Tuning parameters of coaster execution
Message-ID: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>

Hi,

I am trying to understand how to set correctly the coaster-related
parameters to optimize execution of my workflow. A single task I have
takes around 1-2 minutes. I set maxWalltime to 2 minutes, and there 40
of these tasks in my toy workflow. Coasters are configured as
gt2:gt2:pbs. When I run it with the default parameters, the workflow
completes (this is great!).

Now I am trying to understand what's going on and how to improve the
performance. Looking at the scheduler queue, I see that two jobs are
submitted in the beginning of the execution for 18 min each, one with
1 node, and one with 2 nodes. All of the execution is happening in
these two jobs (the number of jobs submitted is just two, for 40 taks,
so looks like things work). First question: why does it happen this
way? (two jobs, 18 minutes each, specific node allocation) I assume
only one of them (2-node) is executing worker tasks, but in this case
why allocation time is 18 minutes, not 20 (each worker walltime is 2
min)?

Second question: how do I make coaster to request more nodes? I tried
to increase nodeGranularity to 10. This resulted in only one (!) job
with 10 nodes and 20 min walltime showing up on the scheduler. But it
looks like the jobs are still executed 2 at a time!

Progress:  Selecting site:38  Active:2

According to documentation, default workersPerNode=1, so I would
expect at least 10 to be active. Again, I don't understand what's
going on uder the hood....

Can coaster experts give me some guidance what is going on, and how to
intelligently set the parameters?

Thanks!

--
Andriy Fedorov, Ph.D.

Research Fellow
Brigham and Women's Hospital
Harvard Medical School
75 Francis Street
Boston, MA 02115 USA
fedorov at bwh.harvard.edu


From wilde at mcs.anl.gov  Mon Oct 19 23:26:02 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 19 Oct 2009 23:26:02 -0500
Subject: [Swift-user] Tuning parameters of coaster execution
In-Reply-To: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
References: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
Message-ID: <4ADD3BDA.6050809@mcs.anl.gov>

Hi Andriy,

We'll need to wait for Mihael to advise you on this, but there's a few 
messages and threads in swift-devel that may be useful:

Ranger block scheduling:

http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005985.html
http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005986.html

Using Ranger with the latest coasters:

http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-October/005994.html

Also, the following maybe helpful to force a specific number of coasters 
to start and/or jobs to run on them, but I dont know how these settings 
interact with the coaster "block" settings:

---
To adjust the throttle, you can use this in your sites.xml <pool> element:

       <profile namespace="karajan" key="jobThrottle">2.55</profile>
       <profile namespace="karajan" key="initialScore">10000</profile>

The #jobs per site is then throttled to (jobThrottle * 100) + 1 = 256
when initialScore is large enough (and 10000 is).

Eg, if you had have cores, set jobThrottle to 0.49
For 200 cores, use 1.99
etc.

If you know how many cores you have available, always set initialScore
to 10000 which bypasses the Swift "slow start".
---

Mihael, can you create a few examples of consistent parameter settings 
that work well together for a few illustrative configurations?

- Mike


On 10/19/09 10:35 PM, Andriy Fedorov wrote:
> Hi,
> 
> I am trying to understand how to set correctly the coaster-related
> parameters to optimize execution of my workflow. A single task I have
> takes around 1-2 minutes. I set maxWalltime to 2 minutes, and there 40
> of these tasks in my toy workflow. Coasters are configured as
> gt2:gt2:pbs. When I run it with the default parameters, the workflow
> completes (this is great!).
> 
> Now I am trying to understand what's going on and how to improve the
> performance. Looking at the scheduler queue, I see that two jobs are
> submitted in the beginning of the execution for 18 min each, one with
> 1 node, and one with 2 nodes. All of the execution is happening in
> these two jobs (the number of jobs submitted is just two, for 40 taks,
> so looks like things work). First question: why does it happen this
> way? (two jobs, 18 minutes each, specific node allocation) I assume
> only one of them (2-node) is executing worker tasks, but in this case
> why allocation time is 18 minutes, not 20 (each worker walltime is 2
> min)?
> 
> Second question: how do I make coaster to request more nodes? I tried
> to increase nodeGranularity to 10. This resulted in only one (!) job
> with 10 nodes and 20 min walltime showing up on the scheduler. But it
> looks like the jobs are still executed 2 at a time!
> 
> Progress:  Selecting site:38  Active:2
> 
> According to documentation, default workersPerNode=1, so I would
> expect at least 10 to be active. Again, I don't understand what's
> going on uder the hood....
> 
> Can coaster experts give me some guidance what is going on, and how to
> intelligently set the parameters?
> 
> Thanks!
> 
> --
> Andriy Fedorov, Ph.D.
> 
> Research Fellow
> Brigham and Women's Hospital
> Harvard Medical School
> 75 Francis Street
> Boston, MA 02115 USA
> fedorov at bwh.harvard.edu
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From fedorov at bwh.harvard.edu  Tue Oct 20 08:51:24 2009
From: fedorov at bwh.harvard.edu (Andriy Fedorov)
Date: Tue, 20 Oct 2009 09:51:24 -0400
Subject: [Swift-user] Tuning parameters of coaster execution
In-Reply-To: <4ADD3BDA.6050809@mcs.anl.gov>
References: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
	<4ADD3BDA.6050809@mcs.anl.gov>
Message-ID: <82f536810910200651o21a755e1m57e7c5ef8dc6d3e4@mail.gmail.com>

Mike,

Very helpful pointers. I did search, but on swift-users, not swift-devel.

Yes, it would be great if there were some typical configuration
examples. For me, as an application developer, it is not obvious to
figure these out, even though I have some experience with TeraGrid and
Globus... Let's see what's Mihael's opinion. I will work from the
examples included in the posts you suggested meanwhile.

Thanks!
--
Andriy Fedorov, Ph.D.

Research Fellow
Brigham and Women's Hospital
Harvard Medical School
75 Francis Street
Boston, MA 02115 USA
fedorov at bwh.harvard.edu


On Tue, Oct 20, 2009 at 00:26, Michael Wilde <wilde at mcs.anl.gov> wrote:
> Hi Andriy,
>
> We'll need to wait for Mihael to advise you on this, but there's a few
> messages and threads in swift-devel that may be useful:
>
> Ranger block scheduling:
>
> http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005985.html
> http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005986.html
>
> Using Ranger with the latest coasters:
>
> http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-October/005994.html
>
> Also, the following maybe helpful to force a specific number of coasters to
> start and/or jobs to run on them, but I dont know how these settings
> interact with the coaster "block" settings:
>
> ---
> To adjust the throttle, you can use this in your sites.xml <pool> element:
>
> ? ? ?<profile namespace="karajan" key="jobThrottle">2.55</profile>
> ? ? ?<profile namespace="karajan" key="initialScore">10000</profile>
>
> The #jobs per site is then throttled to (jobThrottle * 100) + 1 = 256
> when initialScore is large enough (and 10000 is).
>
> Eg, if you had have cores, set jobThrottle to 0.49
> For 200 cores, use 1.99
> etc.
>
> If you know how many cores you have available, always set initialScore
> to 10000 which bypasses the Swift "slow start".
> ---
>
> Mihael, can you create a few examples of consistent parameter settings that
> work well together for a few illustrative configurations?
>
> - Mike
>
>
> On 10/19/09 10:35 PM, Andriy Fedorov wrote:
>>
>> Hi,
>>
>> I am trying to understand how to set correctly the coaster-related
>> parameters to optimize execution of my workflow. A single task I have
>> takes around 1-2 minutes. I set maxWalltime to 2 minutes, and there 40
>> of these tasks in my toy workflow. Coasters are configured as
>> gt2:gt2:pbs. When I run it with the default parameters, the workflow
>> completes (this is great!).
>>
>> Now I am trying to understand what's going on and how to improve the
>> performance. Looking at the scheduler queue, I see that two jobs are
>> submitted in the beginning of the execution for 18 min each, one with
>> 1 node, and one with 2 nodes. All of the execution is happening in
>> these two jobs (the number of jobs submitted is just two, for 40 taks,
>> so looks like things work). First question: why does it happen this
>> way? (two jobs, 18 minutes each, specific node allocation) I assume
>> only one of them (2-node) is executing worker tasks, but in this case
>> why allocation time is 18 minutes, not 20 (each worker walltime is 2
>> min)?
>>
>> Second question: how do I make coaster to request more nodes? I tried
>> to increase nodeGranularity to 10. This resulted in only one (!) job
>> with 10 nodes and 20 min walltime showing up on the scheduler. But it
>> looks like the jobs are still executed 2 at a time!
>>
>> Progress: ?Selecting site:38 ?Active:2
>>
>> According to documentation, default workersPerNode=1, so I would
>> expect at least 10 to be active. Again, I don't understand what's
>> going on uder the hood....
>>
>> Can coaster experts give me some guidance what is going on, and how to
>> intelligently set the parameters?
>>
>> Thanks!
>>
>> --
>> Andriy Fedorov, Ph.D.
>>
>> Research Fellow
>> Brigham and Women's Hospital
>> Harvard Medical School
>> 75 Francis Street
>> Boston, MA 02115 USA
>> fedorov at bwh.harvard.edu
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>


From hategan at mcs.anl.gov  Tue Oct 20 10:55:47 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 20 Oct 2009 10:55:47 -0500
Subject: [Swift-user] Tuning parameters of coaster execution
In-Reply-To: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
References: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
Message-ID: <1256054147.22279.18.camel@localhost>

On Mon, 2009-10-19 at 23:35 -0400, Andriy Fedorov wrote:
> Hi,
> 
> I am trying to understand how to set correctly the coaster-related
> parameters to optimize execution of my workflow. A single task I have
> takes around 1-2 minutes. I set maxWalltime to 2 minutes, and there 40
> of these tasks in my toy workflow. Coasters are configured as
> gt2:gt2:pbs. When I run it with the default parameters, the workflow
> completes (this is great!).
> 
> Now I am trying to understand what's going on and how to improve the
> performance. Looking at the scheduler queue, I see that two jobs are
> submitted in the beginning of the execution for 18 min each, one with
> 1 node, and one with 2 nodes. All of the execution is happening in
> these two jobs (the number of jobs submitted is just two, for 40 taks,
> so looks like things work). First question: why does it happen this
> way? (two jobs, 18 minutes each, specific node allocation) I assume
> only one of them (2-node) is executing worker tasks, but in this case
> why allocation time is 18 minutes, not 20 (each worker walltime is 2
> min)?
> 
> Second question: how do I make coaster to request more nodes? I tried
> to increase nodeGranularity to 10. This resulted in only one (!) job
> with 10 nodes and 20 min walltime showing up on the scheduler. But it
> looks like the jobs are still executed 2 at a time!

You need a more recent version of the code.

A few weeks ago the "parallelism" option was added. By default it's set
to try to allocate as many nodes as there are jobs (parallelism=0.0),
whereas the behavior you see would have parallelism=1.0. I should change
the way the numbers are specified. It's not exactly intuitive unless you
look at how it works.

Anyway, it boils down to the notion of job size and block size. The
block size is defined as workers*bwalltime^parallelism, while the job
size is jwalltime^parallelism. At any given time you can fit roughly
workers*bwalltime^parallelism/jwalltime^parallelism jobs in a block.

You can see that with parallelism=0, that reduces to
workers/count(jobs).

Conversely, with parallelism=1 the jobs size is jwalltime and if your
block had bwalltime you could fit workers*bwalltime/jwalltime jobs in
it.

At the same time, bwalltime is controlled by the overallocation factors.
Once the block walltime is decided, the width (number of workers) is
picked based on the job sizes that need to be fit (according to the
above scheme).

Anyway, to sum it up, use a more recent version.


From fedorov at bwh.harvard.edu  Tue Oct 20 11:04:46 2009
From: fedorov at bwh.harvard.edu (Andriy Fedorov)
Date: Tue, 20 Oct 2009 12:04:46 -0400
Subject: [Swift-user] Tuning parameters of coaster execution
In-Reply-To: <1256054147.22279.18.camel@localhost>
References: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
	<1256054147.22279.18.camel@localhost>
Message-ID: <82f536810910200904x584d8ca3m2da7fab8dc660b1d@mail.gmail.com>

On Tue, Oct 20, 2009 at 11:55, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> You need a more recent version of the code.
>

Mihael, I actually updated svn for both cog and swift yesterday prior
to running the tests. Here's what swift reports I have right now:

Swift svn swift-r3170 cog-r2529


> A few weeks ago the "parallelism" option was added. By default it's set
> to try to allocate as many nodes as there are jobs (parallelism=0.0),
> whereas the behavior you see would have parallelism=1.0. I should change
> the way the numbers are specified. It's not exactly intuitive unless you
> look at how it works.
>
> Anyway, it boils down to the notion of job size and block size. The
> block size is defined as workers*bwalltime^parallelism, while the job
> size is jwalltime^parallelism. At any given time you can fit roughly
> workers*bwalltime^parallelism/jwalltime^parallelism jobs in a block.
>
> You can see that with parallelism=0, that reduces to
> workers/count(jobs).
>
> Conversely, with parallelism=1 the jobs size is jwalltime and if your
> block had bwalltime you could fit workers*bwalltime/jwalltime jobs in
> it.
>
> At the same time, bwalltime is controlled by the overallocation factors.
> Once the block walltime is decided, the width (number of workers) is
> picked based on the job sizes that need to be fit (according to the
> above scheme).
>
> Anyway, to sum it up, use a more recent version.
>
>


From hategan at mcs.anl.gov  Tue Oct 20 11:23:22 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 20 Oct 2009 11:23:22 -0500
Subject: [Swift-user] Tuning parameters of coaster execution
In-Reply-To: <82f536810910200904x584d8ca3m2da7fab8dc660b1d@mail.gmail.com>
References: <82f536810910192035o1eaf761chfff2e006e31fb51a@mail.gmail.com>
	<1256054147.22279.18.camel@localhost>
	<82f536810910200904x584d8ca3m2da7fab8dc660b1d@mail.gmail.com>
Message-ID: <1256055802.24685.13.camel@localhost>

On Tue, 2009-10-20 at 12:04 -0400, Andriy Fedorov wrote:
> On Tue, Oct 20, 2009 at 11:55, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > You need a more recent version of the code.
> >
> 
> Mihael, I actually updated svn for both cog and swift yesterday prior
> to running the tests. Here's what swift reports I have right now:
> 
> Swift svn swift-r3170 cog-r2529

Given that even when you have granularity=10 you still see 2 jobs, I
suspect you are using swift site throttling parameters that force that.
I would set the jobThrottle higher and possibly the initial score
higher.

For troubleshooting, what you could do is, on the remote side, say cat
~/.globus/coasters/coasters.log|grep "BlockQueueProcessor">bqp.log and
post that. Also, you could set the remoteMonitorEnabled profile to
"true" to get visual feedback of what's happening.

The allocation time is 18 minutes because the new stuff doesn't
overallocate using a fixed multiplier (though you can force it to do
so). For small jobs (walltime = 1s) the multiplier is set by
lowOverallocation (10.0 by default) while for large jobs (walltime ->
+inf) the multiplier is 1, with an exponential decay in-between.

If you want to always have blocks being 10 times the job walltime, you
can set highOverallocation to 10.


From HodgessE at uhd.edu  Tue Oct 20 21:41:30 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Tue, 20 Oct 2009 21:41:30 -0500
Subject: [Swift-user] using swift on a cluster
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus>

Hi Swift Users:

I'm on a cluster and would like to use swift on the different sites on the cluster.

How would I do that, please?

Thanks,
Erin


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091020/7125dc98/attachment.html>

From wilde at mcs.anl.gov  Tue Oct 20 22:49:45 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 20 Oct 2009 22:49:45 -0500
Subject: [Swift-user] using swift on a cluster
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus>
Message-ID: <4ADE84D9.6020508@mcs.anl.gov>

Hi Erin,

I'm assuming you meant "use Swift to run jobs on the compute nodes of 
the cluster"?

If so, you first need to find out what scheduler (also called "batch 
system" or "local resource manager") the cluster is running.

Thats typical one of these: PBS, Condor, or SGE.

Either ask your system administrator, or see if the "man" command or 
similar probes give you a clue:

Condor: condor_q -version

condor_q -version
$CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
$CondorPlatform: I386-LINUX_RHEL5 $

PBS: man qstat:

   qstat(1B)  PBS

SGE: man qstat:

   QSTAT(1)   Sun Grid Engine User Commands


If its PBS or Condor, then the Swift user guide gives the sites.xml 
entries to use.

Tell us what you find, then try following the instructions in the user 
guide, and follow up with questions as needed.

- Mike


On 10/20/09 9:41 PM, Hodgess, Erin wrote:
> Hi Swift Users:
> 
> I'm on a cluster and would like to use swift on the different sites on 
> the cluster.
> 
> How would I do that, please?
> 
> Thanks,
> Erin
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From HodgessE at uhd.edu  Wed Oct 21 03:07:22 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 21 Oct 2009 03:07:22 -0500
Subject: [Swift-user] using swift on a cluster
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus>
	<4ADE84D9.6020508@mcs.anl.gov>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>

Hello!

We are indeed using condor.

I wanted to try a small test run, but am running into trouble:

[hodgess at grid bin]$ cat myjob.submit
executable=/usr/bin/id
output=results.output
error=results.error
log=results.log
queue
[hodgess at grid bin]$ condor_submit myjob.submit
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 15.
[hodgess at grid bin]$ ls results*
results.error  results.log  results.output
You have new mail in /var/spool/mail/hodgess
[hodgess at grid bin]$ cat results.log
000 (015.000.000) 10/21 03:06:03 Job submitted from host: <192.168.1.11:46274>
...
001 (015.000.000) 10/21 03:06:05 Job executing on host: <10.1.255.244:44508>
...
002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
...
009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
...
[hodgess at grid bin]$

I'm not sure why the job is not linked.

Any suggestions would be much appreciated.

Thanks,
Erin


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu


-----Original Message-----
From: Michael Wilde [mailto:wilde at mcs.anl.gov]
Sent: Tue 10/20/2009 10:49 PM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] using swift on a cluster
 
Hi Erin,

I'm assuming you meant "use Swift to run jobs on the compute nodes of 
the cluster"?

If so, you first need to find out what scheduler (also called "batch 
system" or "local resource manager") the cluster is running.

Thats typical one of these: PBS, Condor, or SGE.

Either ask your system administrator, or see if the "man" command or 
similar probes give you a clue:

Condor: condor_q -version

condor_q -version
$CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
$CondorPlatform: I386-LINUX_RHEL5 $

PBS: man qstat:

   qstat(1B)  PBS

SGE: man qstat:

   QSTAT(1)   Sun Grid Engine User Commands


If its PBS or Condor, then the Swift user guide gives the sites.xml 
entries to use.

Tell us what you find, then try following the instructions in the user 
guide, and follow up with questions as needed.

- Mike


On 10/20/09 9:41 PM, Hodgess, Erin wrote:
> Hi Swift Users:
> 
> I'm on a cluster and would like to use swift on the different sites on 
> the cluster.
> 
> How would I do that, please?
> 
> Thanks,
> Erin
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091021/776430bb/attachment.html>

From HodgessE at uhd.edu  Wed Oct 21 03:17:00 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 21 Oct 2009 03:17:00 -0500
Subject: [Swift-user] using swift on a cluster
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>

Aha!

I needed the universe=vanilla line.


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu


-----Original Message-----
From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
Sent: Wed 10/21/2009 3:07 AM
To: Michael Wilde
Cc: swift-user at ci.uchicago.edu
Subject: RE: [Swift-user] using swift on a cluster
 
Hello!

We are indeed using condor.

I wanted to try a small test run, but am running into trouble:

[hodgess at grid bin]$ cat myjob.submit
executable=/usr/bin/id
output=results.output
error=results.error
log=results.log
queue
[hodgess at grid bin]$ condor_submit myjob.submit
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 15.
[hodgess at grid bin]$ ls results*
results.error  results.log  results.output
You have new mail in /var/spool/mail/hodgess
[hodgess at grid bin]$ cat results.log
000 (015.000.000) 10/21 03:06:03 Job submitted from host: <192.168.1.11:46274>
...
001 (015.000.000) 10/21 03:06:05 Job executing on host: <10.1.255.244:44508>
...
002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
...
009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
...
[hodgess at grid bin]$

I'm not sure why the job is not linked.

Any suggestions would be much appreciated.

Thanks,
Erin


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu


-----Original Message-----
From: Michael Wilde [mailto:wilde at mcs.anl.gov]
Sent: Tue 10/20/2009 10:49 PM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] using swift on a cluster
 
Hi Erin,

I'm assuming you meant "use Swift to run jobs on the compute nodes of 
the cluster"?

If so, you first need to find out what scheduler (also called "batch 
system" or "local resource manager") the cluster is running.

Thats typical one of these: PBS, Condor, or SGE.

Either ask your system administrator, or see if the "man" command or 
similar probes give you a clue:

Condor: condor_q -version

condor_q -version
$CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
$CondorPlatform: I386-LINUX_RHEL5 $

PBS: man qstat:

   qstat(1B)  PBS

SGE: man qstat:

   QSTAT(1)   Sun Grid Engine User Commands


If its PBS or Condor, then the Swift user guide gives the sites.xml 
entries to use.

Tell us what you find, then try following the instructions in the user 
guide, and follow up with questions as needed.

- Mike


On 10/20/09 9:41 PM, Hodgess, Erin wrote:
> Hi Swift Users:
> 
> I'm on a cluster and would like to use swift on the different sites on 
> the cluster.
> 
> How would I do that, please?
> 
> Thanks,
> Erin
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091021/f3a4d729/attachment.html>

From wilde at mcs.anl.gov  Wed Oct 21 07:02:12 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 21 Oct 2009 07:02:12 -0500
Subject: [Swift-user] using swift on a cluster
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
Message-ID: <4ADEF844.5020202@mcs.anl.gov>

For running Swift locally on a Condor cluster, use a sites.xml based on 
this example:

<execution provider="condor" url="none" />

<config>

   <pool handle="localhost">
     <gridftp url="local://localhost" />
     <execution provider="local" url="none" />
     <workdirectory>/home/erin/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.03</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

   <pool handle="condor">
     <execution provider="condor" url="none"/>
     <gridftp url="local://localhost"/>
     <workdirectory>/home/erin/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.19</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

</config>

The jobThrottle values above will enable Swift to run up to 4 jobs at a 
time on localhost and 20 jobs at a time on the Condor cluster.

Use tc.data to catalog applications on pool or the other.

Set jobThrottle as desired to control execution parallelism.

#jobs run in parallel is (jobThrottle * 100)+1

initialScore=10000 overrides Swift's "start slow" approach to sensing 
the site's responsiveness.

- Mike

On 10/21/09 3:17 AM, Hodgess, Erin wrote:
> Aha!
> 
> I needed the universe=vanilla line.
> 
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
> Sent: Wed 10/21/2009 3:07 AM
> To: Michael Wilde
> Cc: swift-user at ci.uchicago.edu
> Subject: RE: [Swift-user] using swift on a cluster
> 
> Hello!
> 
> We are indeed using condor.
> 
> I wanted to try a small test run, but am running into trouble:
> 
> [hodgess at grid bin]$ cat myjob.submit
> executable=/usr/bin/id
> output=results.output
> error=results.error
> log=results.log
> queue
> [hodgess at grid bin]$ condor_submit myjob.submit
> Submitting job(s).
> Logging submit event(s).
> 1 job(s) submitted to cluster 15.
> [hodgess at grid bin]$ ls results*
> results.error  results.log  results.output
> You have new mail in /var/spool/mail/hodgess
> [hodgess at grid bin]$ cat results.log
> 000 (015.000.000) 10/21 03:06:03 Job submitted from host: 
> <192.168.1.11:46274>
> ...
> 001 (015.000.000) 10/21 03:06:05 Job executing on host: <10.1.255.244:44508>
> ...
> 002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
> ...
> 009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
> ...
> [hodgess at grid bin]$
> 
> I'm not sure why the job is not linked.
> 
> Any suggestions would be much appreciated.
> 
> Thanks,
> Erin
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Tue 10/20/2009 10:49 PM
> To: Hodgess, Erin
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] using swift on a cluster
> 
> Hi Erin,
> 
> I'm assuming you meant "use Swift to run jobs on the compute nodes of
> the cluster"?
> 
> If so, you first need to find out what scheduler (also called "batch
> system" or "local resource manager") the cluster is running.
> 
> Thats typical one of these: PBS, Condor, or SGE.
> 
> Either ask your system administrator, or see if the "man" command or
> similar probes give you a clue:
> 
> Condor: condor_q -version
> 
> condor_q -version
> $CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
> $CondorPlatform: I386-LINUX_RHEL5 $
> 
> PBS: man qstat:
> 
>    qstat(1B)  PBS
> 
> SGE: man qstat:
> 
>    QSTAT(1)   Sun Grid Engine User Commands
> 
> 
> If its PBS or Condor, then the Swift user guide gives the sites.xml
> entries to use.
> 
> Tell us what you find, then try following the instructions in the user
> guide, and follow up with questions as needed.
> 
> - Mike
> 
> 
> On 10/20/09 9:41 PM, Hodgess, Erin wrote:
>  > Hi Swift Users:
>  >
>  > I'm on a cluster and would like to use swift on the different sites on
>  > the cluster.
>  >
>  > How would I do that, please?
>  >
>  > Thanks,
>  > Erin
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  > ------------------------------------------------------------------------
>  >
>  > _______________________________________________
>  > Swift-user mailing list
>  > Swift-user at ci.uchicago.edu
>  > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 
> 


From HodgessE at uhd.edu  Wed Oct 21 09:03:24 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 21 Oct 2009 09:03:24 -0500
Subject: [Swift-user] using swift on a cluster
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
	<4ADEF844.5020202@mcs.anl.gov>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B7@BALI.uhd.campus>

Here is the output:


[hodgess at grid bin]$ swift -tc.file tc.data -sites.file sites.xml firstR.swift
Swift 0.9 swift-r2860 cog-r2388

RunID: 20091021-0901-aku7y862
Progress:
Execution failed:
        No service contacts available
[hodgess at grid bin]$


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu


-----Original Message-----
From: Michael Wilde [mailto:wilde at mcs.anl.gov]
Sent: Wed 10/21/2009 7:02 AM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] using swift on a cluster
 
For running Swift locally on a Condor cluster, use a sites.xml based on 
this example:

<execution provider="condor" url="none" />

<config>

   <pool handle="localhost">
     <gridftp url="local://localhost" />
     <execution provider="local" url="none" />
     <workdirectory>/home/erin/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.03</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

   <pool handle="condor">
     <execution provider="condor" url="none"/>
     <gridftp url="local://localhost"/>
     <workdirectory>/home/erin/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.19</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

</config>

The jobThrottle values above will enable Swift to run up to 4 jobs at a 
time on localhost and 20 jobs at a time on the Condor cluster.

Use tc.data to catalog applications on pool or the other.

Set jobThrottle as desired to control execution parallelism.

#jobs run in parallel is (jobThrottle * 100)+1

initialScore=10000 overrides Swift's "start slow" approach to sensing 
the site's responsiveness.

- Mike

On 10/21/09 3:17 AM, Hodgess, Erin wrote:
> Aha!
> 
> I needed the universe=vanilla line.
> 
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
> Sent: Wed 10/21/2009 3:07 AM
> To: Michael Wilde
> Cc: swift-user at ci.uchicago.edu
> Subject: RE: [Swift-user] using swift on a cluster
> 
> Hello!
> 
> We are indeed using condor.
> 
> I wanted to try a small test run, but am running into trouble:
> 
> [hodgess at grid bin]$ cat myjob.submit
> executable=/usr/bin/id
> output=results.output
> error=results.error
> log=results.log
> queue
> [hodgess at grid bin]$ condor_submit myjob.submit
> Submitting job(s).
> Logging submit event(s).
> 1 job(s) submitted to cluster 15.
> [hodgess at grid bin]$ ls results*
> results.error  results.log  results.output
> You have new mail in /var/spool/mail/hodgess
> [hodgess at grid bin]$ cat results.log
> 000 (015.000.000) 10/21 03:06:03 Job submitted from host: 
> <192.168.1.11:46274>
> ...
> 001 (015.000.000) 10/21 03:06:05 Job executing on host: <10.1.255.244:44508>
> ...
> 002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
> ...
> 009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
> ...
> [hodgess at grid bin]$
> 
> I'm not sure why the job is not linked.
> 
> Any suggestions would be much appreciated.
> 
> Thanks,
> Erin
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Tue 10/20/2009 10:49 PM
> To: Hodgess, Erin
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] using swift on a cluster
> 
> Hi Erin,
> 
> I'm assuming you meant "use Swift to run jobs on the compute nodes of
> the cluster"?
> 
> If so, you first need to find out what scheduler (also called "batch
> system" or "local resource manager") the cluster is running.
> 
> Thats typical one of these: PBS, Condor, or SGE.
> 
> Either ask your system administrator, or see if the "man" command or
> similar probes give you a clue:
> 
> Condor: condor_q -version
> 
> condor_q -version
> $CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
> $CondorPlatform: I386-LINUX_RHEL5 $
> 
> PBS: man qstat:
> 
>    qstat(1B)  PBS
> 
> SGE: man qstat:
> 
>    QSTAT(1)   Sun Grid Engine User Commands
> 
> 
> If its PBS or Condor, then the Swift user guide gives the sites.xml
> entries to use.
> 
> Tell us what you find, then try following the instructions in the user
> guide, and follow up with questions as needed.
> 
> - Mike
> 
> 
> On 10/20/09 9:41 PM, Hodgess, Erin wrote:
>  > Hi Swift Users:
>  >
>  > I'm on a cluster and would like to use swift on the different sites on
>  > the cluster.
>  >
>  > How would I do that, please?
>  >
>  > Thanks,
>  > Erin
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  > ------------------------------------------------------------------------
>  >
>  > _______________________________________________
>  > Swift-user mailing list
>  > Swift-user at ci.uchicago.edu
>  > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091021/52cb76fb/attachment.html>

From wilde at mcs.anl.gov  Wed Oct 21 09:22:35 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 21 Oct 2009 09:22:35 -0500
Subject: [Swift-user] using swift on a cluster
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B7@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
	<4ADEF844.5020202@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B7@BALI.uhd.campus>
Message-ID: <4ADF192B.8020804@mcs.anl.gov>

Erin, we need to look into this further.

Please make sure that you are running either Swift 0.9 or the latest 
source from svn. And tell us what revision you are running.

Also please post your tc.data and sites.xml (and log file is its small 
enought); see if there are any messages in the .log file that would 
clarify the error.

Make sure that your app is cataloged in tc.data as being on pool 
"condor". But I think if it were not, you'd see a different error.

It almost looks to me like Swift is looking for the GRAM service contact 
string, as if it thinks you are asking for Condor-G instead of local 
Condor, eg:

  <profile namespace="globus" key="jobType">grid</profile>
  <profile namespace="globus"
   key="gridResource">gt2 belhaven-1.renci.org/jobmanager-fork</profile>

Just as a test, try changing provider="condor" to "pbs" in sites.xml. If 
the error changes to something like "PBS not installed" or "qsub not 
found" then I would suspect this is the case.

Its possible you can add just the jobType element with the value set to 
vanilla instead of grid, but I am purely *guessing*; we'll look deeper 
as soon as you send the info above and we have time.

- Mike


On 10/21/09 9:03 AM, Hodgess, Erin wrote:
> Here is the output:
> 
> 
> [hodgess at grid bin]$ swift -tc.file tc.data -sites.file sites.xml 
> firstR.swift
> Swift 0.9 swift-r2860 cog-r2388
> 
> RunID: 20091021-0901-aku7y862
> Progress:
> Execution failed:
>         No service contacts available
> [hodgess at grid bin]$
> 
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Wed 10/21/2009 7:02 AM
> To: Hodgess, Erin
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] using swift on a cluster
> 
> For running Swift locally on a Condor cluster, use a sites.xml based on
> this example:
> 
> <execution provider="condor" url="none" />
> 
> <config>
> 
>    <pool handle="localhost">
>      <gridftp url="local://localhost" />
>      <execution provider="local" url="none" />
>      <workdirectory>/home/erin/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.03</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
>    <pool handle="condor">
>      <execution provider="condor" url="none"/>
>      <gridftp url="local://localhost"/>
>      <workdirectory>/home/erin/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.19</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
> </config>
> 
> The jobThrottle values above will enable Swift to run up to 4 jobs at a
> time on localhost and 20 jobs at a time on the Condor cluster.
> 
> Use tc.data to catalog applications on pool or the other.
> 
> Set jobThrottle as desired to control execution parallelism.
> 
> #jobs run in parallel is (jobThrottle * 100)+1
> 
> initialScore=10000 overrides Swift's "start slow" approach to sensing
> the site's responsiveness.
> 
> - Mike
> 
> On 10/21/09 3:17 AM, Hodgess, Erin wrote:
>  > Aha!
>  >
>  > I needed the universe=vanilla line.
>  >
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
>  > Sent: Wed 10/21/2009 3:07 AM
>  > To: Michael Wilde
>  > Cc: swift-user at ci.uchicago.edu
>  > Subject: RE: [Swift-user] using swift on a cluster
>  >
>  > Hello!
>  >
>  > We are indeed using condor.
>  >
>  > I wanted to try a small test run, but am running into trouble:
>  >
>  > [hodgess at grid bin]$ cat myjob.submit
>  > executable=/usr/bin/id
>  > output=results.output
>  > error=results.error
>  > log=results.log
>  > queue
>  > [hodgess at grid bin]$ condor_submit myjob.submit
>  > Submitting job(s).
>  > Logging submit event(s).
>  > 1 job(s) submitted to cluster 15.
>  > [hodgess at grid bin]$ ls results*
>  > results.error  results.log  results.output
>  > You have new mail in /var/spool/mail/hodgess
>  > [hodgess at grid bin]$ cat results.log
>  > 000 (015.000.000) 10/21 03:06:03 Job submitted from host:
>  > <192.168.1.11:46274>
>  > ...
>  > 001 (015.000.000) 10/21 03:06:05 Job executing on host: 
> <10.1.255.244:44508>
>  > ...
>  > 002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
>  > ...
>  > 009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
>  > ...
>  > [hodgess at grid bin]$
>  >
>  > I'm not sure why the job is not linked.
>  >
>  > Any suggestions would be much appreciated.
>  >
>  > Thanks,
>  > Erin
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: Michael Wilde [mailto:wilde at mcs.anl.gov]
>  > Sent: Tue 10/20/2009 10:49 PM
>  > To: Hodgess, Erin
>  > Cc: swift-user at ci.uchicago.edu
>  > Subject: Re: [Swift-user] using swift on a cluster
>  >
>  > Hi Erin,
>  >
>  > I'm assuming you meant "use Swift to run jobs on the compute nodes of
>  > the cluster"?
>  >
>  > If so, you first need to find out what scheduler (also called "batch
>  > system" or "local resource manager") the cluster is running.
>  >
>  > Thats typical one of these: PBS, Condor, or SGE.
>  >
>  > Either ask your system administrator, or see if the "man" command or
>  > similar probes give you a clue:
>  >
>  > Condor: condor_q -version
>  >
>  > condor_q -version
>  > $CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
>  > $CondorPlatform: I386-LINUX_RHEL5 $
>  >
>  > PBS: man qstat:
>  >
>  >    qstat(1B)  PBS
>  >
>  > SGE: man qstat:
>  >
>  >    QSTAT(1)   Sun Grid Engine User Commands
>  >
>  >
>  > If its PBS or Condor, then the Swift user guide gives the sites.xml
>  > entries to use.
>  >
>  > Tell us what you find, then try following the instructions in the user
>  > guide, and follow up with questions as needed.
>  >
>  > - Mike
>  >
>  >
>  > On 10/20/09 9:41 PM, Hodgess, Erin wrote:
>  >  > Hi Swift Users:
>  >  >
>  >  > I'm on a cluster and would like to use swift on the different sites on
>  >  > the cluster.
>  >  >
>  >  > How would I do that, please?
>  >  >
>  >  > Thanks,
>  >  > Erin
>  >  >
>  >  >
>  >  > Erin M. Hodgess, PhD
>  >  > Associate Professor
>  >  > Department of Computer and Mathematical Sciences
>  >  > University of Houston - Downtown
>  >  > mailto: hodgesse at uhd.edu
>  >  >
>  >  >
>  >  > 
> ------------------------------------------------------------------------
>  >  >
>  >  > _______________________________________________
>  >  > Swift-user mailing list
>  >  > Swift-user at ci.uchicago.edu
>  >  > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>  >
>  >
> 


From HodgessE at uhd.edu  Wed Oct 21 12:10:25 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 21 Oct 2009 12:10:25 -0500
Subject: [Swift-user] using swift on a cluster
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
	<4ADEF844.5020202@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B7@BALI.uhd.campus>
	<4ADF192B.8020804@mcs.anl.gov>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C377C0@BALI.uhd.campus>

Hi again!

Here are the sites.xml and tc.data files.

Thanks,
Erin


[hodgess at grid bin]$ cat sites.xml
<execution provider="condor" url="none" />

<config>

   <pool handle="localhost">
     <gridftp url="local://localhost" />
     <execution provider="local" url="none" />
     <workdirectory>/home/hodgess/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.03</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

   <pool handle="condor">
     <execution provider="condor" url="none"/>
     <gridftp url="local://localhost"/>
     <workdirectory>/home/hodgess/swiftwork</workdirectory>
     <profile namespace="karajan" key="jobThrottle">.19</profile>
     <profile namespace="karajan" key="initialScore">10000</profile>
   </pool>

</config>
[hodgess at grid bin]$ cat tc.data
localhost       convert /usr/bin/convert        INSTALLED       INTEL32::LINUX null
localhost       RInvoke /home/hodgess/R-2.9.2/bin/RInvoke.sh    INSTALLED      INTEL32::LINUX   null
condor  RInvoke /home/hodgess/R-2.9.2/bin/RInvoke.sh    INSTALLED       INTEL32::LINUX  null
[hodgess at grid bin]$ cat firstR.R
cat: firstR.R: No such file or directory
[hodgess at grid bin]$ cat firstR.swift
type file{}
app (file output) firstone (file scriptFile) {
    RInvoke  @filename(scriptFile) @filename(output);
    }


        file scriptFile <"a1.in" >;
        file output <"a1.out" >;
            output=firstone(scriptFile);
[hodgess at grid bin]$


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu


-----Original Message-----
From: Michael Wilde [mailto:wilde at mcs.anl.gov]
Sent: Wed 10/21/2009 9:22 AM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] using swift on a cluster
 
Erin, we need to look into this further.

Please make sure that you are running either Swift 0.9 or the latest 
source from svn. And tell us what revision you are running.

Also please post your tc.data and sites.xml (and log file is its small 
enought); see if there are any messages in the .log file that would 
clarify the error.

Make sure that your app is cataloged in tc.data as being on pool 
"condor". But I think if it were not, you'd see a different error.

It almost looks to me like Swift is looking for the GRAM service contact 
string, as if it thinks you are asking for Condor-G instead of local 
Condor, eg:

  <profile namespace="globus" key="jobType">grid</profile>
  <profile namespace="globus"
   key="gridResource">gt2 belhaven-1.renci.org/jobmanager-fork</profile>

Just as a test, try changing provider="condor" to "pbs" in sites.xml. If 
the error changes to something like "PBS not installed" or "qsub not 
found" then I would suspect this is the case.

Its possible you can add just the jobType element with the value set to 
vanilla instead of grid, but I am purely *guessing*; we'll look deeper 
as soon as you send the info above and we have time.

- Mike


On 10/21/09 9:03 AM, Hodgess, Erin wrote:
> Here is the output:
> 
> 
> [hodgess at grid bin]$ swift -tc.file tc.data -sites.file sites.xml 
> firstR.swift
> Swift 0.9 swift-r2860 cog-r2388
> 
> RunID: 20091021-0901-aku7y862
> Progress:
> Execution failed:
>         No service contacts available
> [hodgess at grid bin]$
> 
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Wed 10/21/2009 7:02 AM
> To: Hodgess, Erin
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] using swift on a cluster
> 
> For running Swift locally on a Condor cluster, use a sites.xml based on
> this example:
> 
> <execution provider="condor" url="none" />
> 
> <config>
> 
>    <pool handle="localhost">
>      <gridftp url="local://localhost" />
>      <execution provider="local" url="none" />
>      <workdirectory>/home/erin/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.03</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
>    <pool handle="condor">
>      <execution provider="condor" url="none"/>
>      <gridftp url="local://localhost"/>
>      <workdirectory>/home/erin/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.19</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
> </config>
> 
> The jobThrottle values above will enable Swift to run up to 4 jobs at a
> time on localhost and 20 jobs at a time on the Condor cluster.
> 
> Use tc.data to catalog applications on pool or the other.
> 
> Set jobThrottle as desired to control execution parallelism.
> 
> #jobs run in parallel is (jobThrottle * 100)+1
> 
> initialScore=10000 overrides Swift's "start slow" approach to sensing
> the site's responsiveness.
> 
> - Mike
> 
> On 10/21/09 3:17 AM, Hodgess, Erin wrote:
>  > Aha!
>  >
>  > I needed the universe=vanilla line.
>  >
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
>  > Sent: Wed 10/21/2009 3:07 AM
>  > To: Michael Wilde
>  > Cc: swift-user at ci.uchicago.edu
>  > Subject: RE: [Swift-user] using swift on a cluster
>  >
>  > Hello!
>  >
>  > We are indeed using condor.
>  >
>  > I wanted to try a small test run, but am running into trouble:
>  >
>  > [hodgess at grid bin]$ cat myjob.submit
>  > executable=/usr/bin/id
>  > output=results.output
>  > error=results.error
>  > log=results.log
>  > queue
>  > [hodgess at grid bin]$ condor_submit myjob.submit
>  > Submitting job(s).
>  > Logging submit event(s).
>  > 1 job(s) submitted to cluster 15.
>  > [hodgess at grid bin]$ ls results*
>  > results.error  results.log  results.output
>  > You have new mail in /var/spool/mail/hodgess
>  > [hodgess at grid bin]$ cat results.log
>  > 000 (015.000.000) 10/21 03:06:03 Job submitted from host:
>  > <192.168.1.11:46274>
>  > ...
>  > 001 (015.000.000) 10/21 03:06:05 Job executing on host: 
> <10.1.255.244:44508>
>  > ...
>  > 002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for Condor.
>  > ...
>  > 009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
>  > ...
>  > [hodgess at grid bin]$
>  >
>  > I'm not sure why the job is not linked.
>  >
>  > Any suggestions would be much appreciated.
>  >
>  > Thanks,
>  > Erin
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: Michael Wilde [mailto:wilde at mcs.anl.gov]
>  > Sent: Tue 10/20/2009 10:49 PM
>  > To: Hodgess, Erin
>  > Cc: swift-user at ci.uchicago.edu
>  > Subject: Re: [Swift-user] using swift on a cluster
>  >
>  > Hi Erin,
>  >
>  > I'm assuming you meant "use Swift to run jobs on the compute nodes of
>  > the cluster"?
>  >
>  > If so, you first need to find out what scheduler (also called "batch
>  > system" or "local resource manager") the cluster is running.
>  >
>  > Thats typical one of these: PBS, Condor, or SGE.
>  >
>  > Either ask your system administrator, or see if the "man" command or
>  > similar probes give you a clue:
>  >
>  > Condor: condor_q -version
>  >
>  > condor_q -version
>  > $CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
>  > $CondorPlatform: I386-LINUX_RHEL5 $
>  >
>  > PBS: man qstat:
>  >
>  >    qstat(1B)  PBS
>  >
>  > SGE: man qstat:
>  >
>  >    QSTAT(1)   Sun Grid Engine User Commands
>  >
>  >
>  > If its PBS or Condor, then the Swift user guide gives the sites.xml
>  > entries to use.
>  >
>  > Tell us what you find, then try following the instructions in the user
>  > guide, and follow up with questions as needed.
>  >
>  > - Mike
>  >
>  >
>  > On 10/20/09 9:41 PM, Hodgess, Erin wrote:
>  >  > Hi Swift Users:
>  >  >
>  >  > I'm on a cluster and would like to use swift on the different sites on
>  >  > the cluster.
>  >  >
>  >  > How would I do that, please?
>  >  >
>  >  > Thanks,
>  >  > Erin
>  >  >
>  >  >
>  >  > Erin M. Hodgess, PhD
>  >  > Associate Professor
>  >  > Department of Computer and Mathematical Sciences
>  >  > University of Houston - Downtown
>  >  > mailto: hodgesse at uhd.edu
>  >  >
>  >  >
>  >  > 
> ------------------------------------------------------------------------
>  >  >
>  >  > _______________________________________________
>  >  > Swift-user mailing list
>  >  > Swift-user at ci.uchicago.edu
>  >  > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>  >
>  >
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091021/717d9493/attachment.html>

From wilde at mcs.anl.gov  Wed Oct 21 12:36:43 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 21 Oct 2009 12:36:43 -0500
Subject: [Swift-user] using swift on a cluster
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C377C0@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C377B3@BALI.uhd.campus><4ADE84D9.6020508@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B5@BALI.uhd.campus>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B6@BALI.uhd.campus>
	<4ADEF844.5020202@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377B7@BALI.uhd.campus>
	<4ADF192B.8020804@mcs.anl.gov>
	<70A5AC06FDB5E54482D19E1C04CDFCF307C377C0@BALI.uhd.campus>
Message-ID: <4ADF46AB.8030603@mcs.anl.gov>

Erin,

The first line of your sites.xml file seems to be left there in error:

  > [hodgess at grid bin]$ cat sites.xml
  > <execution provider="condor" url="none" />

Can you remove that and try again? Im not sure how that got parsed.

- Mike

On 10/21/09 12:10 PM, Hodgess, Erin wrote:
> Hi again!
> 
> Here are the sites.xml and tc.data files.
> 
> Thanks,
> Erin
> 
> 
> [hodgess at grid bin]$ cat sites.xml
> <execution provider="condor" url="none" />
> 
> <config>
> 
>    <pool handle="localhost">
>      <gridftp url="local://localhost" />
>      <execution provider="local" url="none" />
>      <workdirectory>/home/hodgess/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.03</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
>    <pool handle="condor">
>      <execution provider="condor" url="none"/>
>      <gridftp url="local://localhost"/>
>      <workdirectory>/home/hodgess/swiftwork</workdirectory>
>      <profile namespace="karajan" key="jobThrottle">.19</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>    </pool>
> 
> </config>
> [hodgess at grid bin]$ cat tc.data
> localhost       convert /usr/bin/convert        INSTALLED       
> INTEL32::LINUX null
> localhost       RInvoke /home/hodgess/R-2.9.2/bin/RInvoke.sh    
> INSTALLED      INTEL32::LINUX   null
> condor  RInvoke /home/hodgess/R-2.9.2/bin/RInvoke.sh    INSTALLED       
> INTEL32::LINUX  null
> [hodgess at grid bin]$ cat firstR.R
> cat: firstR.R: No such file or directory
> [hodgess at grid bin]$ cat firstR.swift
> type file{}
> app (file output) firstone (file scriptFile) {
>     RInvoke  @filename(scriptFile) @filename(output);
>     }
> 
> 
>         file scriptFile <"a1.in" >;
>         file output <"a1.out" >;
>             output=firstone(scriptFile);
> [hodgess at grid bin]$
> 
> 
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
> 
> 
> 
> -----Original Message-----
> From: Michael Wilde [mailto:wilde at mcs.anl.gov]
> Sent: Wed 10/21/2009 9:22 AM
> To: Hodgess, Erin
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] using swift on a cluster
> 
> Erin, we need to look into this further.
> 
> Please make sure that you are running either Swift 0.9 or the latest
> source from svn. And tell us what revision you are running.
> 
> Also please post your tc.data and sites.xml (and log file is its small
> enought); see if there are any messages in the .log file that would
> clarify the error.
> 
> Make sure that your app is cataloged in tc.data as being on pool
> "condor". But I think if it were not, you'd see a different error.
> 
> It almost looks to me like Swift is looking for the GRAM service contact
> string, as if it thinks you are asking for Condor-G instead of local
> Condor, eg:
> 
>   <profile namespace="globus" key="jobType">grid</profile>
>   <profile namespace="globus"
>    key="gridResource">gt2 belhaven-1.renci.org/jobmanager-fork</profile>
> 
> Just as a test, try changing provider="condor" to "pbs" in sites.xml. If
> the error changes to something like "PBS not installed" or "qsub not
> found" then I would suspect this is the case.
> 
> Its possible you can add just the jobType element with the value set to
> vanilla instead of grid, but I am purely *guessing*; we'll look deeper
> as soon as you send the info above and we have time.
> 
> - Mike
> 
> 
> On 10/21/09 9:03 AM, Hodgess, Erin wrote:
>  > Here is the output:
>  >
>  >
>  > [hodgess at grid bin]$ swift -tc.file tc.data -sites.file sites.xml
>  > firstR.swift
>  > Swift 0.9 swift-r2860 cog-r2388
>  >
>  > RunID: 20091021-0901-aku7y862
>  > Progress:
>  > Execution failed:
>  >         No service contacts available
>  > [hodgess at grid bin]$
>  >
>  >
>  >
>  > Erin M. Hodgess, PhD
>  > Associate Professor
>  > Department of Computer and Mathematical Sciences
>  > University of Houston - Downtown
>  > mailto: hodgesse at uhd.edu
>  >
>  >
>  >
>  > -----Original Message-----
>  > From: Michael Wilde [mailto:wilde at mcs.anl.gov]
>  > Sent: Wed 10/21/2009 7:02 AM
>  > To: Hodgess, Erin
>  > Cc: swift-user at ci.uchicago.edu
>  > Subject: Re: [Swift-user] using swift on a cluster
>  >
>  > For running Swift locally on a Condor cluster, use a sites.xml based on
>  > this example:
>  >
>  > <execution provider="condor" url="none" />
>  >
>  > <config>
>  >
>  >    <pool handle="localhost">
>  >      <gridftp url="local://localhost" />
>  >      <execution provider="local" url="none" />
>  >      <workdirectory>/home/erin/swiftwork</workdirectory>
>  >      <profile namespace="karajan" key="jobThrottle">.03</profile>
>  >      <profile namespace="karajan" key="initialScore">10000</profile>
>  >    </pool>
>  >
>  >    <pool handle="condor">
>  >      <execution provider="condor" url="none"/>
>  >      <gridftp url="local://localhost"/>
>  >      <workdirectory>/home/erin/swiftwork</workdirectory>
>  >      <profile namespace="karajan" key="jobThrottle">.19</profile>
>  >      <profile namespace="karajan" key="initialScore">10000</profile>
>  >    </pool>
>  >
>  > </config>
>  >
>  > The jobThrottle values above will enable Swift to run up to 4 jobs at a
>  > time on localhost and 20 jobs at a time on the Condor cluster.
>  >
>  > Use tc.data to catalog applications on pool or the other.
>  >
>  > Set jobThrottle as desired to control execution parallelism.
>  >
>  > #jobs run in parallel is (jobThrottle * 100)+1
>  >
>  > initialScore=10000 overrides Swift's "start slow" approach to sensing
>  > the site's responsiveness.
>  >
>  > - Mike
>  >
>  > On 10/21/09 3:17 AM, Hodgess, Erin wrote:
>  >  > Aha!
>  >  >
>  >  > I needed the universe=vanilla line.
>  >  >
>  >  >
>  >  >
>  >  > Erin M. Hodgess, PhD
>  >  > Associate Professor
>  >  > Department of Computer and Mathematical Sciences
>  >  > University of Houston - Downtown
>  >  > mailto: hodgesse at uhd.edu
>  >  >
>  >  >
>  >  >
>  >  > -----Original Message-----
>  >  > From: swift-user-bounces at ci.uchicago.edu on behalf of Hodgess, Erin
>  >  > Sent: Wed 10/21/2009 3:07 AM
>  >  > To: Michael Wilde
>  >  > Cc: swift-user at ci.uchicago.edu
>  >  > Subject: RE: [Swift-user] using swift on a cluster
>  >  >
>  >  > Hello!
>  >  >
>  >  > We are indeed using condor.
>  >  >
>  >  > I wanted to try a small test run, but am running into trouble:
>  >  >
>  >  > [hodgess at grid bin]$ cat myjob.submit
>  >  > executable=/usr/bin/id
>  >  > output=results.output
>  >  > error=results.error
>  >  > log=results.log
>  >  > queue
>  >  > [hodgess at grid bin]$ condor_submit myjob.submit
>  >  > Submitting job(s).
>  >  > Logging submit event(s).
>  >  > 1 job(s) submitted to cluster 15.
>  >  > [hodgess at grid bin]$ ls results*
>  >  > results.error  results.log  results.output
>  >  > You have new mail in /var/spool/mail/hodgess
>  >  > [hodgess at grid bin]$ cat results.log
>  >  > 000 (015.000.000) 10/21 03:06:03 Job submitted from host:
>  >  > <192.168.1.11:46274>
>  >  > ...
>  >  > 001 (015.000.000) 10/21 03:06:05 Job executing on host:
>  > <10.1.255.244:44508>
>  >  > ...
>  >  > 002 (015.000.000) 10/21 03:06:05 (1) Job not properly linked for 
> Condor.
>  >  > ...
>  >  > 009 (015.000.000) 10/21 03:06:05 Job was aborted by the user.
>  >  > ...
>  >  > [hodgess at grid bin]$
>  >  >
>  >  > I'm not sure why the job is not linked.
>  >  >
>  >  > Any suggestions would be much appreciated.
>  >  >
>  >  > Thanks,
>  >  > Erin
>  >  >
>  >  >
>  >  > Erin M. Hodgess, PhD
>  >  > Associate Professor
>  >  > Department of Computer and Mathematical Sciences
>  >  > University of Houston - Downtown
>  >  > mailto: hodgesse at uhd.edu
>  >  >
>  >  >
>  >  >
>  >  > -----Original Message-----
>  >  > From: Michael Wilde [mailto:wilde at mcs.anl.gov]
>  >  > Sent: Tue 10/20/2009 10:49 PM
>  >  > To: Hodgess, Erin
>  >  > Cc: swift-user at ci.uchicago.edu
>  >  > Subject: Re: [Swift-user] using swift on a cluster
>  >  >
>  >  > Hi Erin,
>  >  >
>  >  > I'm assuming you meant "use Swift to run jobs on the compute nodes of
>  >  > the cluster"?
>  >  >
>  >  > If so, you first need to find out what scheduler (also called "batch
>  >  > system" or "local resource manager") the cluster is running.
>  >  >
>  >  > Thats typical one of these: PBS, Condor, or SGE.
>  >  >
>  >  > Either ask your system administrator, or see if the "man" command or
>  >  > similar probes give you a clue:
>  >  >
>  >  > Condor: condor_q -version
>  >  >
>  >  > condor_q -version
>  >  > $CondorVersion: 7.2.4 Jun 16 2009 BuildID: 159529 $
>  >  > $CondorPlatform: I386-LINUX_RHEL5 $
>  >  >
>  >  > PBS: man qstat:
>  >  >
>  >  >    qstat(1B)  PBS
>  >  >
>  >  > SGE: man qstat:
>  >  >
>  >  >    QSTAT(1)   Sun Grid Engine User Commands
>  >  >
>  >  >
>  >  > If its PBS or Condor, then the Swift user guide gives the sites.xml
>  >  > entries to use.
>  >  >
>  >  > Tell us what you find, then try following the instructions in the user
>  >  > guide, and follow up with questions as needed.
>  >  >
>  >  > - Mike
>  >  >
>  >  >
>  >  > On 10/20/09 9:41 PM, Hodgess, Erin wrote:
>  >  >  > Hi Swift Users:
>  >  >  >
>  >  >  > I'm on a cluster and would like to use swift on the different 
> sites on
>  >  >  > the cluster.
>  >  >  >
>  >  >  > How would I do that, please?
>  >  >  >
>  >  >  > Thanks,
>  >  >  > Erin
>  >  >  >
>  >  >  >
>  >  >  > Erin M. Hodgess, PhD
>  >  >  > Associate Professor
>  >  >  > Department of Computer and Mathematical Sciences
>  >  >  > University of Houston - Downtown
>  >  >  > mailto: hodgesse at uhd.edu
>  >  >  >
>  >  >  >
>  >  >  >
>  > ------------------------------------------------------------------------
>  >  >  >
>  >  >  > _______________________________________________
>  >  >  > Swift-user mailing list
>  >  >  > Swift-user at ci.uchicago.edu
>  >  >  > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>  >  >
>  >  >
>  >
> 


From skenny at uchicago.edu  Fri Oct 23 10:45:04 2009
From: skenny at uchicago.edu (skenny at uchicago.edu)
Date: Fri, 23 Oct 2009 10:45:04 -0500 (CDT)
Subject: [Swift-user] Re: [Swift-devel] burnin' up ranger w/the
 latest coasters
In-Reply-To: <20091014153915.CDW69329@m4500-02.uchicago.edu>
References: <20091013111417.CDU59058@m4500-02.uchicago.edu>
	<20091014153915.CDW69329@m4500-02.uchicago.edu>
Message-ID: <20091023104504.CEI95549@m4500-02.uchicago.edu>

however...when i use the configs here and i try to run a
workflow with 196,608 jobs it seems that coasters starts to
ramp up nicely, but maybe a little too well :) as it begins
requesting more cores than i'm allowed in the normal queue on
ranger. that is, the limit is 4096. i tried changing maxNodes
to 4096 which did not work. i'm wondering if workers per node
should actually be 16 instead (?) but i know you've gotten it
to work well with the setting at 32 so i'm not sure...

anyway, it ramped up nicely (and was only like 8 jobs away
from finishing the whole thing) i just need to know how to cap
it off so it won't ask for more than 4096 cores. 

thanks
~sk

---- Original message ----
>Date: Wed, 14 Oct 2009 15:39:15 -0500 (CDT)
>From: <skenny at uchicago.edu>  
>Subject: Re: [Swift-devel] burnin' up ranger w/the latest
coasters  
>To: swift-user at ci.uchicago.edu, swift-devel at ci.uchicago.edu
>
>for those interested, here are the config files used for this
run:
>
>swift.properties:
>
>sites.file=config/coaster_ranger.xml
>tc.file=/ci/projects/cnari/config/tc.data
>lazy.errors=false
>caching.algorithm=LRU
>pgraph=false
>pgraph.graph.options=splines="compound", rankdir="TB"
>pgraph.node.options=color="seagreen", style="filled"
>clustering.enabled=false
>clustering.queue.delay=4
>clustering.min.time=60
>kickstart.enabled=maybe
>kickstart.always.transfer=false
>wrapperlog.always.transfer=false
>throttle.submit=3
>throttle.host.submit=8
>throttle.score.job.factor=64
>throttle.transfers=16
>throttle.file.operations=16
>sitedir.keep=false
>execution.retries=3
>replication.enabled=false
>replication.min.queue.time=60
>replication.limit=3
>foreach.max.threads=16384
>
>coaster_ranger.xml:
>
><config>
> <import file="sys.xml"/>
><import file="vdl-lib.xml"/>
><set name="username">
>  <arg><vdl:new type="string" value="user"/></arg>
></set>
><pool handle="RANGER">
>    <profile namespace="karajan"
>key="jobThrottle">1000.0</profile>
>    <filesystem provider="coaster"
>url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
>    <profile namespace="globus" key="queue">normal</profile>
>    <profile namespace="globus" key="workersPerNode">32</profile>
>    <profile namespace="globus" key="nodeGranularity">1</profile>
>    <profile namespace="globus" key="slots">16</profile>
>    <profile namespace="globus" key="maxNodes">8192</profile>
>    <profile namespace="globus" key="maxTime">72000</profile>
>    <profile namespace="globus"
>key="project">TG-DBS080004N</profile>
>    <execution provider="coaster"
>url="gatekeeper.ranger.tacc.teragrid.org"
>jobManager="gt2:gt2:SGE"/>
>    <!--
>workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory
>-->
>   
><workdirectory>/work/00926/tg459516/sidgrid_out/{username}</workdirectory>
>  </pool>
></config>
>
>
>
>---- Original message ----
>>Date: Tue, 13 Oct 2009 11:14:17 -0500 (CDT)
>>From: <skenny at uchicago.edu>  
>>Subject: [Swift-devel] burnin' up ranger w/the latest coasters  
>>To: swift-devel at ci.uchicago.edu
>>
>>Final status:  Finished successfully:131072 
>>
>>re-running some of the workflows from our recent SEM
>>paper with the latest swift...sadly, queue time on ranger has
>>only gone up since those initial runs...but luckily coasters
>>has speeded things up, so it ends up evening out for time to
>>solution :)
>>
>>not sure i fully understand the plot:
>>
>>http://www.ci.uchicago.edu/~skenny/workflows/sem_131k/
>>
>>log is here:
>>
>>/ci/projects/cnari/logs/skenny/4reg_2cond-20091012-1607-ugidm2s2.log
>>_______________________________________________
>>Swift-devel mailing list
>>Swift-devel at ci.uchicago.edu
>>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>_______________________________________________
>Swift-devel mailing list
>Swift-devel at ci.uchicago.edu
>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel


From hategan at mcs.anl.gov  Fri Oct 23 11:37:10 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 23 Oct 2009 11:37:10 -0500
Subject: [Swift-user] Re: [Swift-devel] burnin' up ranger w/the latest
	coasters
In-Reply-To: <20091023104504.CEI95549@m4500-02.uchicago.edu>
References: <20091013111417.CDU59058@m4500-02.uchicago.edu>
	<20091014153915.CDW69329@m4500-02.uchicago.edu>
	<20091023104504.CEI95549@m4500-02.uchicago.edu>
Message-ID: <1256315830.10810.5.camel@localhost>

On Fri, 2009-10-23 at 10:45 -0500, skenny at uchicago.edu wrote:
> however...when i use the configs here and i try to run a
> workflow with 196,608 jobs it seems that coasters starts to
> ramp up nicely, but maybe a little too well :) as it begins
> requesting more cores than i'm allowed in the normal queue on
> ranger. that is, the limit is 4096. i tried changing maxNodes
> to 4096 which did not work.

Shouldn't that be 4096/workersPerNode?

>  i'm wondering if workers per node
> should actually be 16 instead (?) but i know you've gotten it
> to work well with the setting at 32 so i'm not sure...

You could set it to 16. My reasoning for doubling it was that if the
processes you run are slightly I/O bound, then you'd get slightly better
performance by running two processes per core. 

> 
> anyway, it ramped up nicely (and was only like 8 jobs away
> from finishing the whole thing) i just need to know how to cap
> it off so it won't ask for more than 4096 cores. 
> 
> thanks
> ~sk
> 
> ---- Original message ----
> >Date: Wed, 14 Oct 2009 15:39:15 -0500 (CDT)
> >From: <skenny at uchicago.edu>  
> >Subject: Re: [Swift-devel] burnin' up ranger w/the latest
> coasters  
> >To: swift-user at ci.uchicago.edu, swift-devel at ci.uchicago.edu
> >
> >for those interested, here are the config files used for this
> run:
> >
> >swift.properties:
> >
> >sites.file=config/coaster_ranger.xml
> >tc.file=/ci/projects/cnari/config/tc.data
> >lazy.errors=false
> >caching.algorithm=LRU
> >pgraph=false
> >pgraph.graph.options=splines="compound", rankdir="TB"
> >pgraph.node.options=color="seagreen", style="filled"
> >clustering.enabled=false
> >clustering.queue.delay=4
> >clustering.min.time=60
> >kickstart.enabled=maybe
> >kickstart.always.transfer=false
> >wrapperlog.always.transfer=false
> >throttle.submit=3
> >throttle.host.submit=8
> >throttle.score.job.factor=64
> >throttle.transfers=16
> >throttle.file.operations=16
> >sitedir.keep=false
> >execution.retries=3
> >replication.enabled=false
> >replication.min.queue.time=60
> >replication.limit=3
> >foreach.max.threads=16384
> >
> >coaster_ranger.xml:
> >
> ><config>
> > <import file="sys.xml"/>
> ><import file="vdl-lib.xml"/>
> ><set name="username">
> >  <arg><vdl:new type="string" value="user"/></arg>
> ></set>
> ><pool handle="RANGER">
> >    <profile namespace="karajan"
> >key="jobThrottle">1000.0</profile>
> >    <filesystem provider="coaster"
> >url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
> >    <profile namespace="globus" key="queue">normal</profile>
> >    <profile namespace="globus" key="workersPerNode">32</profile>
> >    <profile namespace="globus" key="nodeGranularity">1</profile>
> >    <profile namespace="globus" key="slots">16</profile>
> >    <profile namespace="globus" key="maxNodes">8192</profile>
> >    <profile namespace="globus" key="maxTime">72000</profile>
> >    <profile namespace="globus"
> >key="project">TG-DBS080004N</profile>
> >    <execution provider="coaster"
> >url="gatekeeper.ranger.tacc.teragrid.org"
> >jobManager="gt2:gt2:SGE"/>
> >    <!--
> >workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory
> >-->
> >   
> ><workdirectory>/work/00926/tg459516/sidgrid_out/{username}</workdirectory>
> >  </pool>
> ></config>
> >
> >
> >
> >---- Original message ----
> >>Date: Tue, 13 Oct 2009 11:14:17 -0500 (CDT)
> >>From: <skenny at uchicago.edu>  
> >>Subject: [Swift-devel] burnin' up ranger w/the latest coasters  
> >>To: swift-devel at ci.uchicago.edu
> >>
> >>Final status:  Finished successfully:131072 
> >>
> >>re-running some of the workflows from our recent SEM
> >>paper with the latest swift...sadly, queue time on ranger has
> >>only gone up since those initial runs...but luckily coasters
> >>has speeded things up, so it ends up evening out for time to
> >>solution :)
> >>
> >>not sure i fully understand the plot:
> >>
> >>http://www.ci.uchicago.edu/~skenny/workflows/sem_131k/
> >>
> >>log is here:
> >>
> >>/ci/projects/cnari/logs/skenny/4reg_2cond-20091012-1607-ugidm2s2.log
> >>_______________________________________________
> >>Swift-devel mailing list
> >>Swift-devel at ci.uchicago.edu
> >>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >_______________________________________________
> >Swift-devel mailing list
> >Swift-devel at ci.uchicago.edu
> >http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel


From skenny at uchicago.edu  Fri Oct 23 13:02:00 2009
From: skenny at uchicago.edu (skenny at uchicago.edu)
Date: Fri, 23 Oct 2009 13:02:00 -0500 (CDT)
Subject: [Swift-user] Re: [Swift-devel] burnin' up ranger w/the
 latest coasters
In-Reply-To: <1256315830.10810.5.camel@localhost>
References: <20091013111417.CDU59058@m4500-02.uchicago.edu>
	<20091014153915.CDW69329@m4500-02.uchicago.edu>
	<20091023104504.CEI95549@m4500-02.uchicago.edu>
	<1256315830.10810.5.camel@localhost>
Message-ID: <20091023130200.CEJ16713@m4500-02.uchicago.edu>

>> however...when i use the configs here and i try to run a
>> workflow with 196,608 jobs it seems that coasters starts to
>> ramp up nicely, but maybe a little too well :) as it begins
>> requesting more cores than i'm allowed in the normal queue on
>> ranger. that is, the limit is 4096. i tried changing maxNodes
>> to 4096 which did not work.
>
>Shouldn't that be 4096/workersPerNode?

don't think i'm understanding you here...the workersPerNode
you originally suggested was 32. why would i increase that to
4096 when what i'm trying to do is request fewer total cores? 

>
>>  i'm wondering if workers per node
>> should actually be 16 instead (?) but i know you've gotten it
>> to work well with the setting at 32 so i'm not sure...
>
>You could set it to 16. My reasoning for doubling it was that
if the
>processes you run are slightly I/O bound, then you'd get
slightly better
>performance by running two processes per core. 
>
>> 
>> anyway, it ramped up nicely (and was only like 8 jobs away
>> from finishing the whole thing) i just need to know how to cap
>> it off so it won't ask for more than 4096 cores. 
>> 
>> thanks
>> ~sk
>> 
>> ---- Original message ----
>> >Date: Wed, 14 Oct 2009 15:39:15 -0500 (CDT)
>> >From: <skenny at uchicago.edu>  
>> >Subject: Re: [Swift-devel] burnin' up ranger w/the latest
>> coasters  
>> >To: swift-user at ci.uchicago.edu, swift-devel at ci.uchicago.edu
>> >
>> >for those interested, here are the config files used for this
>> run:
>> >
>> >swift.properties:
>> >
>> >sites.file=config/coaster_ranger.xml
>> >tc.file=/ci/projects/cnari/config/tc.data
>> >lazy.errors=false
>> >caching.algorithm=LRU
>> >pgraph=false
>> >pgraph.graph.options=splines="compound", rankdir="TB"
>> >pgraph.node.options=color="seagreen", style="filled"
>> >clustering.enabled=false
>> >clustering.queue.delay=4
>> >clustering.min.time=60
>> >kickstart.enabled=maybe
>> >kickstart.always.transfer=false
>> >wrapperlog.always.transfer=false
>> >throttle.submit=3
>> >throttle.host.submit=8
>> >throttle.score.job.factor=64
>> >throttle.transfers=16
>> >throttle.file.operations=16
>> >sitedir.keep=false
>> >execution.retries=3
>> >replication.enabled=false
>> >replication.min.queue.time=60
>> >replication.limit=3
>> >foreach.max.threads=16384
>> >
>> >coaster_ranger.xml:
>> >
>> ><config>
>> > <import file="sys.xml"/>
>> ><import file="vdl-lib.xml"/>
>> ><set name="username">
>> >  <arg><vdl:new type="string" value="user"/></arg>
>> ></set>
>> ><pool handle="RANGER">
>> >    <profile namespace="karajan"
>> >key="jobThrottle">1000.0</profile>
>> >    <filesystem provider="coaster"
>> >url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
>> >    <profile namespace="globus" key="queue">normal</profile>
>> >    <profile namespace="globus"
key="workersPerNode">32</profile>
>> >    <profile namespace="globus"
key="nodeGranularity">1</profile>
>> >    <profile namespace="globus" key="slots">16</profile>
>> >    <profile namespace="globus" key="maxNodes">8192</profile>
>> >    <profile namespace="globus" key="maxTime">72000</profile>
>> >    <profile namespace="globus"
>> >key="project">TG-DBS080004N</profile>
>> >    <execution provider="coaster"
>> >url="gatekeeper.ranger.tacc.teragrid.org"
>> >jobManager="gt2:gt2:SGE"/>
>> >    <!--
>>
>workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory
>> >-->
>> >   
>>
><workdirectory>/work/00926/tg459516/sidgrid_out/{username}</workdirectory>
>> >  </pool>
>> ></config>
>> >
>> >
>> >
>> >---- Original message ----
>> >>Date: Tue, 13 Oct 2009 11:14:17 -0500 (CDT)
>> >>From: <skenny at uchicago.edu>  
>> >>Subject: [Swift-devel] burnin' up ranger w/the latest
coasters  
>> >>To: swift-devel at ci.uchicago.edu
>> >>
>> >>Final status:  Finished successfully:131072 
>> >>
>> >>re-running some of the workflows from our recent SEM
>> >>paper with the latest swift...sadly, queue time on ranger has
>> >>only gone up since those initial runs...but luckily coasters
>> >>has speeded things up, so it ends up evening out for time to
>> >>solution :)
>> >>
>> >>not sure i fully understand the plot:
>> >>
>> >>http://www.ci.uchicago.edu/~skenny/workflows/sem_131k/
>> >>
>> >>log is here:
>> >>
>>
>>/ci/projects/cnari/logs/skenny/4reg_2cond-20091012-1607-ugidm2s2.log
>> >>_______________________________________________
>> >>Swift-devel mailing list
>> >>Swift-devel at ci.uchicago.edu
>> >>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>> >_______________________________________________
>> >Swift-devel mailing list
>> >Swift-devel at ci.uchicago.edu
>> >http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>


From hategan at mcs.anl.gov  Fri Oct 23 13:18:24 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 23 Oct 2009 13:18:24 -0500
Subject: [Swift-user] Re: [Swift-devel] burnin' up ranger w/the latest
	coasters
In-Reply-To: <20091023130200.CEJ16713@m4500-02.uchicago.edu>
References: <20091013111417.CDU59058@m4500-02.uchicago.edu>
	<20091014153915.CDW69329@m4500-02.uchicago.edu>
	<20091023104504.CEI95549@m4500-02.uchicago.edu>
	<1256315830.10810.5.camel@localhost>
	<20091023130200.CEJ16713@m4500-02.uchicago.edu>
Message-ID: <1256321904.15412.3.camel@localhost>

On Fri, 2009-10-23 at 13:02 -0500, skenny at uchicago.edu wrote:
> >> however...when i use the configs here and i try to run a
> >> workflow with 196,608 jobs it seems that coasters starts to
> >> ramp up nicely, but maybe a little too well :) as it begins
> >> requesting more cores than i'm allowed in the normal queue on
> >> ranger. that is, the limit is 4096. i tried changing maxNodes
> >> to 4096 which did not work.
> >
> >Shouldn't that be 4096/workersPerNode?
> 
> don't think i'm understanding you here...the workersPerNode
> you originally suggested was 32. why would i increase that to
> 4096 when what i'm trying to do is request fewer total cores? 

No.
If you have C CORESPerNODE and the maximum number of CORES you can
request is 4096, then the maximum NODES you can request is 4096/C not
4096.

You are setting maxNodes to 4096. That means it will request 4096*16
cores.


From wtan at mcs.anl.gov  Mon Oct 26 16:56:03 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Mon, 26 Oct 2009 16:56:03 -0500
Subject: [Swift-user] Chesnoknov workflow
Message-ID: <4AE61AF3.3040208@mcs.anl.gov>

Hi Mike and others,

    A recapture of the story:
    We have a chesnokov workflow which contains a forEach and the input 
file array has 1171 files.
Running it at a desktop and in a 32-core server have different 
performance indexes, in terms of execution time
See 
http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en

--------------------the data Mike wants to 
see---------------------------------------------
   In the same 32-core machine (crush), using a local file system 
instead of the share file system, will reduce the execution time, from 
*2min20sec~2min50sec*, to *1min40sec~1min50sec*.
--------------------end of the the data Mike wants to 
see---------------------------------------------
 

  Best regards,

Wei

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From foster at anl.gov  Mon Oct 26 17:09:00 2009
From: foster at anl.gov (Ian Foster)
Date: Mon, 26 Oct 2009 17:09:00 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE61AF3.3040208@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
Message-ID: <00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>

that's great. Do you have the Swift log plots?

On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:

> Hi Mike and others,
>
>   A recapture of the story:
>   We have a chesnokov workflow which contains a forEach and the  
> input file array has 1171 files.
> Running it at a desktop and in a 32-core server have different  
> performance indexes, in terms of execution time
> See http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en
>
> --------------------the data Mike wants to  
> see---------------------------------------------
>  In the same 32-core machine (crush), using a local file system  
> instead of the share file system, will reduce the execution time,  
> from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
> --------------------end of the the data Mike wants to  
> see---------------------------------------------
>
> Best regards,
>
> Wei
>
> -- 
> Wei Tan, Ph.D.
> Computation Institute
> the University of Chicago|Argonne National Laboratory
> http://www.mcs.anl.gov/~wtan
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From wtan at mcs.anl.gov  Mon Oct 26 17:18:46 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Mon, 26 Oct 2009 17:18:46 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
Message-ID: <4AE62046.1060003@mcs.anl.gov>

Hi Ian,

    I am not sure which log plots you are talking about. But my working 
directory is
crush.mcs.anl.gov/tmp/wtan, I guess you can find all the logs you want 
to see there?

To be more specific:
/workingdir: the directory from which I issue the command line swift 
...:there are some error logs since there are 23/1141 failure jobs.
/ecg3... directory generated when running swift workflow
/swift-workflows the directory containing the workflow
/app: the directory containing the executable and the input files
/swift-0.9 the swift installation directory


   Best regards,

Wei

   
Ian Foster wrote:
> that's great. Do you have the Swift log plots?
>
> On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:
>
>> Hi Mike and others,
>>
>>   A recapture of the story:
>>   We have a chesnokov workflow which contains a forEach and the input 
>> file array has 1171 files.
>> Running it at a desktop and in a 32-core server have different 
>> performance indexes, in terms of execution time
>> See 
>> http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en 
>>
>>
>> --------------------the data Mike wants to 
>> see---------------------------------------------
>>  In the same 32-core machine (crush), using a local file system 
>> instead of the share file system, will reduce the execution time, 
>> from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
>> --------------------end of the the data Mike wants to 
>> see---------------------------------------------
>>
>> Best regards,
>>
>> Wei
>>
>> -- 
>> Wei Tan, Ph.D.
>> Computation Institute
>> the University of Chicago|Argonne National Laboratory
>> http://www.mcs.anl.gov/~wtan
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From foster at anl.gov  Mon Oct 26 17:24:08 2009
From: foster at anl.gov (Ian Foster)
Date: Mon, 26 Oct 2009 17:24:08 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE62046.1060003@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
Message-ID: <2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>

Hi Wei:

I can't recall the details, but there are nice tools for generating a  
Web page with plots.

Ian/


On Oct 26, 2009, at 5:18 PM, Wei Tan wrote:

> Hi Ian,
>
>   I am not sure which log plots you are talking about. But my  
> working directory is
> crush.mcs.anl.gov/tmp/wtan, I guess you can find all the logs you  
> want to see there?
>
> To be more specific:
> /workingdir: the directory from which I issue the command line  
> swift ...:there are some error logs since there are 23/1141 failure  
> jobs.
> /ecg3... directory generated when running swift workflow
> /swift-workflows the directory containing the workflow
> /app: the directory containing the executable and the input files
> /swift-0.9 the swift installation directory
>
>
>  Best regards,
>
> Wei
>
>
> Ian Foster wrote:
>> that's great. Do you have the Swift log plots?
>>
>> On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:
>>
>>> Hi Mike and others,
>>>
>>>  A recapture of the story:
>>>  We have a chesnokov workflow which contains a forEach and the  
>>> input file array has 1171 files.
>>> Running it at a desktop and in a 32-core server have different  
>>> performance indexes, in terms of execution time
>>> See http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en
>>>
>>> --------------------the data Mike wants to  
>>> see---------------------------------------------
>>> In the same 32-core machine (crush), using a local file system  
>>> instead of the share file system, will reduce the execution time,  
>>> from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
>>> --------------------end of the the data Mike wants to  
>>> see---------------------------------------------
>>>
>>> Best regards,
>>>
>>> Wei
>>>
>>> -- 
>>> Wei Tan, Ph.D.
>>> Computation Institute
>>> the University of Chicago|Argonne National Laboratory
>>> http://www.mcs.anl.gov/~wtan
>>>
>>> _______________________________________________
>>> Swift-user mailing list
>>> Swift-user at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>
>
> -- 
> Wei Tan, Ph.D.
> Computation Institute
> the University of Chicago|Argonne National Laboratory
> http://www.mcs.anl.gov/~wtan
>


From wilde at mcs.anl.gov  Mon Oct 26 17:28:03 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 26 Oct 2009 17:28:03 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
Message-ID: <4AE62273.2020500@mcs.anl.gov>

Wei, its the swift-plot-log command, in the Swift user guide at:

http://www.ci.uchicago.edu/swift/guides/userguide.php#id2711073

- Mike

On 10/26/09 5:24 PM, Ian Foster wrote:
> Hi Wei:
> 
> I can't recall the details, but there are nice tools for generating a  
> Web page with plots.
> 
> Ian/
> 
> 
> On Oct 26, 2009, at 5:18 PM, Wei Tan wrote:
> 
>> Hi Ian,
>>
>>   I am not sure which log plots you are talking about. But my  
>> working directory is
>> crush.mcs.anl.gov/tmp/wtan, I guess you can find all the logs you  
>> want to see there?
>>
>> To be more specific:
>> /workingdir: the directory from which I issue the command line  
>> swift ...:there are some error logs since there are 23/1141 failure  
>> jobs.
>> /ecg3... directory generated when running swift workflow
>> /swift-workflows the directory containing the workflow
>> /app: the directory containing the executable and the input files
>> /swift-0.9 the swift installation directory
>>
>>
>>  Best regards,
>>
>> Wei
>>
>>
>> Ian Foster wrote:
>>> that's great. Do you have the Swift log plots?
>>>
>>> On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:
>>>
>>>> Hi Mike and others,
>>>>
>>>>  A recapture of the story:
>>>>  We have a chesnokov workflow which contains a forEach and the  
>>>> input file array has 1171 files.
>>>> Running it at a desktop and in a 32-core server have different  
>>>> performance indexes, in terms of execution time
>>>> See http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en
>>>>
>>>> --------------------the data Mike wants to  
>>>> see---------------------------------------------
>>>> In the same 32-core machine (crush), using a local file system  
>>>> instead of the share file system, will reduce the execution time,  
>>>> from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
>>>> --------------------end of the the data Mike wants to  
>>>> see---------------------------------------------
>>>>
>>>> Best regards,
>>>>
>>>> Wei
>>>>
>>>> -- 
>>>> Wei Tan, Ph.D.
>>>> Computation Institute
>>>> the University of Chicago|Argonne National Laboratory
>>>> http://www.mcs.anl.gov/~wtan
>>>>
>>>> _______________________________________________
>>>> Swift-user mailing list
>>>> Swift-user at ci.uchicago.edu
>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>> -- 
>> Wei Tan, Ph.D.
>> Computation Institute
>> the University of Chicago|Argonne National Laboratory
>> http://www.mcs.anl.gov/~wtan
>>
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


From wilde at mcs.anl.gov  Mon Oct 26 23:29:09 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 26 Oct 2009 23:29:09 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE62273.2020500@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov>
Message-ID: <4AE67715.1090909@mcs.anl.gov>

More documentation on the log processing tools is at:

   http://www.ci.uchicago.edu/swift/guides/log-processing.php

- Mike

On 10/26/09 5:28 PM, Michael Wilde wrote:
> Wei, its the swift-plot-log command, in the Swift user guide at:
> 
> http://www.ci.uchicago.edu/swift/guides/userguide.php#id2711073
> 
> - Mike
> 
> On 10/26/09 5:24 PM, Ian Foster wrote:
>> Hi Wei:
>>
>> I can't recall the details, but there are nice tools for generating a  
>> Web page with plots.
>>
>> Ian/
>>
>>
>> On Oct 26, 2009, at 5:18 PM, Wei Tan wrote:
>>
>>> Hi Ian,
>>>
>>>   I am not sure which log plots you are talking about. But my  
>>> working directory is
>>> crush.mcs.anl.gov/tmp/wtan, I guess you can find all the logs you  
>>> want to see there?
>>>
>>> To be more specific:
>>> /workingdir: the directory from which I issue the command line  
>>> swift ...:there are some error logs since there are 23/1141 failure  
>>> jobs.
>>> /ecg3... directory generated when running swift workflow
>>> /swift-workflows the directory containing the workflow
>>> /app: the directory containing the executable and the input files
>>> /swift-0.9 the swift installation directory
>>>
>>>
>>>  Best regards,
>>>
>>> Wei
>>>
>>>
>>> Ian Foster wrote:
>>>> that's great. Do you have the Swift log plots?
>>>>
>>>> On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:
>>>>
>>>>> Hi Mike and others,
>>>>>
>>>>>  A recapture of the story:
>>>>>  We have a chesnokov workflow which contains a forEach and the  
>>>>> input file array has 1171 files.
>>>>> Running it at a desktop and in a 32-core server have different  
>>>>> performance indexes, in terms of execution time
>>>>> See http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en
>>>>>
>>>>> --------------------the data Mike wants to  
>>>>> see---------------------------------------------
>>>>> In the same 32-core machine (crush), using a local file system  
>>>>> instead of the share file system, will reduce the execution time,  
>>>>> from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
>>>>> --------------------end of the the data Mike wants to  
>>>>> see---------------------------------------------
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Wei
>>>>>
>>>>> -- 
>>>>> Wei Tan, Ph.D.
>>>>> Computation Institute
>>>>> the University of Chicago|Argonne National Laboratory
>>>>> http://www.mcs.anl.gov/~wtan
>>>>>
>>>>> _______________________________________________
>>>>> Swift-user mailing list
>>>>> Swift-user at ci.uchicago.edu
>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>> -- 
>>> Wei Tan, Ph.D.
>>> Computation Institute
>>> the University of Chicago|Argonne National Laboratory
>>> http://www.mcs.anl.gov/~wtan
>>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> 


From wtan at mcs.anl.gov  Tue Oct 27 11:49:54 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 11:49:54 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE67715.1090909@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
Message-ID: <4AE724B2.2050507@mcs.anl.gov>


I got this result when running swift-plot-log

Log file path is /tmp/wtan/workingdir/ecg3-20091026-1332-fkizh09c.log
Log is in directory /tmp/wtan/workingdir
Log basename is ecg3-20091026-1332-fkizh09c
Now in directory /tmp/swift-plot-log-btGjiRNxbiel5334
make: ../swift-0.9/bin/../libexec/log-processing//makefile: No such file 
or directory
make: *** No rule to make target 
`../swift-0.9/bin/../libexec/log-processing//makefile'.  Stop.

 From the webpage, it seems that I need to install gnuplot 4.0, gnu m4, 
gnu textutils, perl first?
I am working on it and will post the result here.

Thanks,

Wei

Michael Wilde wrote:
> More documentation on the log processing tools is at:
>
>   http://www.ci.uchicago.edu/swift/guides/log-processing.php
>
> - Mike
>
> On 10/26/09 5:28 PM, Michael Wilde wrote:
>> Wei, its the swift-plot-log command, in the Swift user guide at:
>>
>> http://www.ci.uchicago.edu/swift/guides/userguide.php#id2711073
>>
>> - Mike
>>
>> On 10/26/09 5:24 PM, Ian Foster wrote:
>>> Hi Wei:
>>>
>>> I can't recall the details, but there are nice tools for generating 
>>> a  Web page with plots.
>>>
>>> Ian/
>>>
>>>
>>> On Oct 26, 2009, at 5:18 PM, Wei Tan wrote:
>>>
>>>> Hi Ian,
>>>>
>>>>   I am not sure which log plots you are talking about. But my  
>>>> working directory is
>>>> crush.mcs.anl.gov/tmp/wtan, I guess you can find all the logs you  
>>>> want to see there?
>>>>
>>>> To be more specific:
>>>> /workingdir: the directory from which I issue the command line  
>>>> swift ...:there are some error logs since there are 23/1141 
>>>> failure  jobs.
>>>> /ecg3... directory generated when running swift workflow
>>>> /swift-workflows the directory containing the workflow
>>>> /app: the directory containing the executable and the input files
>>>> /swift-0.9 the swift installation directory
>>>>
>>>>
>>>>  Best regards,
>>>>
>>>> Wei
>>>>
>>>>
>>>> Ian Foster wrote:
>>>>> that's great. Do you have the Swift log plots?
>>>>>
>>>>> On Oct 26, 2009, at 4:56 PM, Wei Tan wrote:
>>>>>
>>>>>> Hi Mike and others,
>>>>>>
>>>>>>  A recapture of the story:
>>>>>>  We have a chesnokov workflow which contains a forEach and the  
>>>>>> input file array has 1171 files.
>>>>>> Running it at a desktop and in a 32-core server have different  
>>>>>> performance indexes, in terms of execution time
>>>>>> See 
>>>>>> http://spreadsheets.google.com/ccc?key=0AriiWNEG__VUdEM0ampSaWRqcGROTW1TNE00X29GVHc&hl=en 
>>>>>>
>>>>>>
>>>>>> --------------------the data Mike wants to  
>>>>>> see---------------------------------------------
>>>>>> In the same 32-core machine (crush), using a local file system  
>>>>>> instead of the share file system, will reduce the execution 
>>>>>> time,  from *2min20sec~2min50sec*, to *1min40sec~1min50sec*.
>>>>>> --------------------end of the the data Mike wants to  
>>>>>> see---------------------------------------------
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Wei
>>>>>>
>>>>>> -- 
>>>>>> Wei Tan, Ph.D.
>>>>>> Computation Institute
>>>>>> the University of Chicago|Argonne National Laboratory
>>>>>> http://www.mcs.anl.gov/~wtan
>>>>>>
>>>>>> _______________________________________________
>>>>>> Swift-user mailing list
>>>>>> Swift-user at ci.uchicago.edu
>>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>>> -- 
>>>> Wei Tan, Ph.D.
>>>> Computation Institute
>>>> the University of Chicago|Argonne National Laboratory
>>>> http://www.mcs.anl.gov/~wtan
>>>>
>>> _______________________________________________
>>> Swift-user mailing list
>>> Swift-user at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From hategan at mcs.anl.gov  Tue Oct 27 11:58:27 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 11:58:27 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE724B2.2050507@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov>
Message-ID: <1256662707.20353.8.camel@localhost>

On Tue, 2009-10-27 at 11:49 -0500, Wei Tan wrote:
> I got this result when running swift-plot-log
> 
> Log file path is /tmp/wtan/workingdir/ecg3-20091026-1332-fkizh09c.log
> Log is in directory /tmp/wtan/workingdir
> Log basename is ecg3-20091026-1332-fkizh09c
> Now in directory /tmp/swift-plot-log-btGjiRNxbiel5334
> make: ../swift-0.9/bin/../libexec/log-processing//makefile: No such file 
> or directory
> make: *** No rule to make target 
> `../swift-0.9/bin/../libexec/log-processing//makefile'.  Stop.
> 
>  From the webpage, it seems that I need to install gnuplot 4.0, gnu m4, 
> gnu textutils, perl first?
> I am working on it and will post the result here.

May I suggest doing this on a CI machine instead of cygwin?


From wtan at mcs.anl.gov  Tue Oct 27 11:59:47 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 11:59:47 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256662707.20353.8.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>	
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	
	<4AE62046.1060003@mcs.anl.gov>	
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>	
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>	
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
Message-ID: <4AE72703.2020100@mcs.anl.gov>


>
> May I suggest doing this on a CI machine instead of cygwin?
>
>   
Sure, but I am using a MCS machine, not cygwin.  :-)
Thanks,

Wei

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From hategan at mcs.anl.gov  Tue Oct 27 12:03:39 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 12:03:39 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE72703.2020100@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>
Message-ID: <1256663019.20883.0.camel@localhost>

What exact command did you type to launch swift-plot-log?

On Tue, 2009-10-27 at 11:59 -0500, Wei Tan wrote:
> >
> > May I suggest doing this on a CI machine instead of cygwin?
> >
> >   
> Sure, but I am using a MCS machine, not cygwin.  :-)
> Thanks,
> 
> Wei
> 


From hategan at mcs.anl.gov  Tue Oct 27 12:16:20 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 12:16:20 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256663019.20883.0.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>  <1256663019.20883.0.camel@localhost>
Message-ID: <1256663780.22694.5.camel@localhost>

On Tue, 2009-10-27 at 12:03 -0500, Mihael Hategan wrote:
> What exact command did you type to launch swift-plot-log?

> ../swift-0.9/bin/swift-plot-log ecg3-20091026-1332-fkizh09c.log

For reference:

The solution is:
$PWD/../swift-0.9/bin/swift-plot-log ecg3-20091026-1332-fkizh09c.log

The problem is that when swift-plot-log is invoked with a relative
directory in the path name, it fails to find its tools after doing a cd.


From wtan at mcs.anl.gov  Tue Oct 27 15:50:07 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 15:50:07 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256663780.22694.5.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>	
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	
	<4AE62046.1060003@mcs.anl.gov>	
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>	
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>	
	<4AE724B2.2050507@mcs.anl.gov>
	<1256662707.20353.8.camel@localhost>	
	<4AE72703.2020100@mcs.anl.gov> <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost>
Message-ID: <4AE75CFF.10201@mcs.anl.gov>

Hi Mihael,

     Thanks. The command generated the msg attached and no result file 
can be found.
      Best regards,

Wei

Mihael Hategan wrote:
> On Tue, 2009-10-27 at 12:03 -0500, Mihael Hategan wrote:
>   
>> What exact command did you type to launch swift-plot-log?
>>     
>
>   
>> ../swift-0.9/bin/swift-plot-log ecg3-20091026-1332-fkizh09c.log
>>     
>
> For reference:
>
> The solution is:
> $PWD/../swift-0.9/bin/swift-plot-log ecg3-20091026-1332-fkizh09c.log
>
> The problem is that when swift-plot-log is invoked with a relative
> directory in the path name, it fails to find its tools after doing a cd.
>
>
>
>   

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: error.txt
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091027/9cbc8f64/attachment.txt>

From hategan at mcs.anl.gov  Tue Oct 27 15:56:43 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 15:56:43 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE75CFF.10201@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>  <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost>  <4AE75CFF.10201@mcs.anl.gov>
Message-ID: <1256677003.27643.1.camel@localhost>

On Tue, 2009-10-27 at 15:50 -0500, Wei Tan wrote:
> Hi Mihael,
> 
>      Thanks. The command generated the msg attached and no result file 
> can be found.

There should be a result directory
called /tmp/wtan/workingdir/report-ecg3-20091026-1332-fkizh09c


From wtan at mcs.anl.gov  Tue Oct 27 16:05:47 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 16:05:47 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256677003.27643.1.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>	
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	
	<4AE62046.1060003@mcs.anl.gov>	
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>	
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>	
	<4AE724B2.2050507@mcs.anl.gov>
	<1256662707.20353.8.camel@localhost>	
	<4AE72703.2020100@mcs.anl.gov>
	<1256663019.20883.0.camel@localhost>	
	<1256663780.22694.5.camel@localhost> <4AE75CFF.10201@mcs.anl.gov>
	<1256677003.27643.1.camel@localhost>
Message-ID: <4AE760AB.9040809@mcs.anl.gov>

Yes but that is the directory when I execute the workflow?
I issued this command:
/tmp/wtan/swift-0.9/bin/swift-plot-log 
/tmp/wtan/workingdir/ecg3-20091026-1332-fkizh09c.log

Is that the correct log file?

Thanks,

Wei


Mihael Hategan wrote:
> On Tue, 2009-10-27 at 15:50 -0500, Wei Tan wrote:
>   
>> Hi Mihael,
>>
>>      Thanks. The command generated the msg attached and no result file 
>> can be found.
>>     
>
> There should be a result directory
> called /tmp/wtan/workingdir/report-ecg3-20091026-1332-fkizh09c
>
>   

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From hategan at mcs.anl.gov  Tue Oct 27 16:09:41 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 16:09:41 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE760AB.9040809@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>  <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost>  <4AE75CFF.10201@mcs.anl.gov>
	<1256677003.27643.1.camel@localhost>  <4AE760AB.9040809@mcs.anl.gov>
Message-ID: <1256677781.27934.1.camel@localhost>

On Tue, 2009-10-27 at 16:05 -0500, Wei Tan wrote:
> Yes but that is the directory when I execute the workflow?

No. Note the fact that what I'm pointing at begins with "report-".

> I issued this command:
> /tmp/wtan/swift-0.9/bin/swift-plot-log 
> /tmp/wtan/workingdir/ecg3-20091026-1332-fkizh09c.log

I'm not sure, but I think the report directory is in the same place as
the log.

It may also be in the current directory, whatever that is.

> > There should be a result directory
> > called /tmp/wtan/workingdir/report-ecg3-20091026-1332-fkizh09c
> >
> >   
> 


From wtan at mcs.anl.gov  Tue Oct 27 16:23:06 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 16:23:06 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256677781.27934.1.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>	
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	
	<4AE62046.1060003@mcs.anl.gov>	
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>	
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>	
	<4AE724B2.2050507@mcs.anl.gov>
	<1256662707.20353.8.camel@localhost>	
	<4AE72703.2020100@mcs.anl.gov>
	<1256663019.20883.0.camel@localhost>	
	<1256663780.22694.5.camel@localhost> <4AE75CFF.10201@mcs.anl.gov>	
	<1256677003.27643.1.camel@localhost> <4AE760AB.9040809@mcs.anl.gov>
	<1256677781.27934.1.camel@localhost>
Message-ID: <4AE764BA.70101@mcs.anl.gov>

There is no report-* directory generated. I think the stdout contains 
some error msg so I posted it in my previous email.

Thanks,
Wei

Mihael Hategan wrote:
> On Tue, 2009-10-27 at 16:05 -0500, Wei Tan wrote:
>   
>> Yes but that is the directory when I execute the workflow?
>>     
>
> No. Note the fact that what I'm pointing at begins with "report-".
>
>   
>> I issued this command:
>> /tmp/wtan/swift-0.9/bin/swift-plot-log 
>> /tmp/wtan/workingdir/ecg3-20091026-1332-fkizh09c.log
>>     
>
> I'm not sure, but I think the report directory is in the same place as
> the log.
>
> It may also be in the current directory, whatever that is.
>
>   
>>> There should be a result directory
>>> called /tmp/wtan/workingdir/report-ecg3-20091026-1332-fkizh09c
>>>
>>>   
>>>       
>
>   

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan


From hategan at mcs.anl.gov  Tue Oct 27 16:28:28 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 27 Oct 2009 16:28:28 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE764BA.70101@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>  <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost>  <4AE75CFF.10201@mcs.anl.gov>
	<1256677003.27643.1.camel@localhost>  <4AE760AB.9040809@mcs.anl.gov>
	<1256677781.27934.1.camel@localhost>  <4AE764BA.70101@mcs.anl.gov>
Message-ID: <1256678908.28392.1.camel@localhost>

On Tue, 2009-10-27 at 16:23 -0500, Wei Tan wrote:
> There is no report-* directory generated. I think the stdout contains 
> some error msg so I posted it in my previous email.

I don't see anything. Maybe the error was on stderr or a sub-process?


From wtan at mcs.anl.gov  Tue Oct 27 16:47:25 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Tue, 27 Oct 2009 16:47:25 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256678908.28392.1.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>	
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>	
	<4AE62046.1060003@mcs.anl.gov>	
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>	
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>	
	<4AE724B2.2050507@mcs.anl.gov>
	<1256662707.20353.8.camel@localhost>	
	<4AE72703.2020100@mcs.anl.gov>
	<1256663019.20883.0.camel@localhost>	
	<1256663780.22694.5.camel@localhost> <4AE75CFF.10201@mcs.anl.gov>	
	<1256677003.27643.1.camel@localhost>
	<4AE760AB.9040809@mcs.anl.gov>	
	<1256677781.27934.1.camel@localhost> <4AE764BA.70101@mcs.anl.gov>
	<1256678908.28392.1.camel@localhost>
Message-ID: <4AE76A6D.1000808@mcs.anl.gov>

I attached the stdout and stderr.

Mihael Hategan wrote:
> On Tue, 2009-10-27 at 16:23 -0500, Wei Tan wrote:
>   
>> There is no report-* directory generated. I think the stdout contains 
>> some error msg so I posted it in my previous email.
>>     
>
> I don't see anything. Maybe the error was on stderr or a sub-process?
>
>
>   

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: error2.txt
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20091027/155d757a/attachment.txt>

From benc at hawaga.org.uk  Wed Oct 28 22:47:53 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Thu, 29 Oct 2009 03:47:53 +0000 (GMT)
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <1256663780.22694.5.camel@localhost>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov>  <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost>
Message-ID: <Pine.LNX.4.64.0910290347380.14799@dildano.hawaga.org.uk>


> For reference:
> 
> The solution is:
> $PWD/../swift-0.9/bin/swift-plot-log ecg3-20091026-1332-fkizh09c.log

or put swift on your path?

-- 


From benc at hawaga.org.uk  Wed Oct 28 22:53:37 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Thu, 29 Oct 2009 03:53:37 +0000 (GMT)
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <4AE76A6D.1000808@mcs.anl.gov>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov> 
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost> 
	<4AE72703.2020100@mcs.anl.gov> <1256663019.20883.0.camel@localhost> 
	<1256663780.22694.5.camel@localhost> <4AE75CFF.10201@mcs.anl.gov> 
	<1256677003.27643.1.camel@localhost> <4AE760AB.9040809@mcs.anl.gov> 
	<1256677781.27934.1.camel@localhost> <4AE764BA.70101@mcs.anl.gov>
	<1256678908.28392.1.camel@localhost> <4AE76A6D.1000808@mcs.anl.gov>
Message-ID: <Pine.LNX.4.64.0910290352130.14799@dildano.hawaga.org.uk>


The error is with missing kickstart directory - I thought r2794 fixed 
that, but perhaps not...


From wtan at mcs.anl.gov  Wed Oct 28 23:01:29 2009
From: wtan at mcs.anl.gov (Wei Tan)
Date: Wed, 28 Oct 2009 23:01:29 -0500
Subject: [Swift-user] Chesnoknov workflow
In-Reply-To: <Pine.LNX.4.64.0910290352130.14799@dildano.hawaga.org.uk>
References: <4AE61AF3.3040208@mcs.anl.gov>
	<00D43302-AE07-449E-AE6B-BAB5F41316FB@anl.gov>
	<4AE62046.1060003@mcs.anl.gov>
	<2F0ABDE0-FCD3-44C8-B33B-1302DBB485C2@anl.gov>
	<4AE62273.2020500@mcs.anl.gov> <4AE67715.1090909@mcs.anl.gov>
	<4AE724B2.2050507@mcs.anl.gov> <1256662707.20353.8.camel@localhost>
	<4AE72703.2020100@mcs.anl.gov> <1256663019.20883.0.camel@localhost>
	<1256663780.22694.5.camel@localhost> <4AE75CFF.10201@mcs.anl.gov>
	<1256677003.27643.1.camel@localhost> <4AE760AB.9040809@mcs.anl.gov>
	<1256677781.27934.1.camel@localhost> <4AE764BA.70101@mcs.anl.gov>
	<1256678908.28392.1.camel@localhost> <4AE76A6D.1000808@mcs.anl.gov>
	<Pine.LNX.4.64.0910290352130.14799@dildano.hawaga.org.uk>
Message-ID: <4AE91399.3080604@mcs.anl.gov>

Hi Ben,

    I used the absolute directory and it works fine now. Thanks for your 
reply anyway :-)

    Best regards,

Wei

Ben Clifford wrote:
> The error is with missing kickstart directory - I thought r2794 fixed 
> that, but perhaps not...
>
>
>
>   

-- 
Wei Tan, Ph.D.
Computation Institute
the University of Chicago|Argonne National Laboratory
http://www.mcs.anl.gov/~wtan