From wilde at mcs.anl.gov  Wed Jun  3 08:04:29 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 03 Jun 2009 08:04:29 -0500
Subject: [Swift-user] Re: [Swft] joining up with swift users
In-Reply-To: <4A266903.4030507@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EF6@BALI.uhd.campus>
 <1244014767.32393.3.camel@localhost> <4A266903.4030507@mcs.anl.gov>
Message-ID: <4A2674DD.6080902@mcs.anl.gov>

I think good sites to use for initial work are TeraPort (fast queue) and
uc-teragrid, and Ranger to work with Sarah's SEM scripts.

I suggested later using OSG and the swift-osg-ress-site-catalog
command (after the Condor-G provider is tested further). Is this command
documented? I thought there was a section describing it in the Users
Guide "How-To Tips for Specific User Communities" section, but I was
mistaken.

- Mike

On 6/3/09 7:13 AM, Michael Wilde wrote:
> Erin will be using the SIDGrid/CNARI TG allocation.
>
> - Mike
>
> On 6/3/09 2:39 AM, Mihael Hategan wrote:
>> On Tue, 2009-06-02 at 22:08 -0500, Hodgess, Erin wrote:
>>> Hi Swift users!
>>
>> You are probably looking for either swift-devel at ci.uchicago.edu or
>> swift-user at ci.uchicago.edu or both.
>>> Do I need to sign up to join you, please?
>>
>> Instructions for joining those mailing lists are at
>> http://www.ci.uchicago.edu/swift/support/index.php.
>>> Also, could someone direct me to a good sites.xml file please?
>>
>> What sites are you looking forward to using? Do you have a Teragrid
>> allocation?
>>
>> Mihael
>>

From hategan at mcs.anl.gov  Wed Jun  3 08:46:50 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 03 Jun 2009 08:46:50 -0500
Subject: [Swift-user] Re: [Swft] joining up with swift users
In-Reply-To: <4A2674DD.6080902@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EF6@BALI.uhd.campus>
 <1244014767.32393.3.camel@localhost> <4A266903.4030507@mcs.anl.gov>
 <4A2674DD.6080902@mcs.anl.gov>
Message-ID: <1244036810.5930.0.camel@localhost>

On Wed, 2009-06-03 at 08:04 -0500, Michael Wilde wrote:
> I think good sites to use for initial work are TeraPort (fast queue) and
> uc-teragrid, and Ranger to work with Sarah's SEM scripts.
>
> I suggested later using OSG and the swift-osg-ress-site-catalog
> command (after the Condor-G provider is tested further). Is this command
> documented? I thought there was a section describing it in the Users
> Guide "How-To Tips for Specific User Communities" section, but I was
> mistaken.

Right. It's in the "Commands" section:
http://www.ci.uchicago.edu/swift/guides/userguide.php#id2643664

From benc at hawaga.org.uk  Wed Jun  3 09:51:42 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 3 Jun 2009 14:51:42 +0000 (GMT)
Subject: [Swift-user] Re: [Swft] joining up with swift users
In-Reply-To: <4A2674DD.6080902@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EF6@BALI.uhd.campus>
 <1244014767.32393.3.camel@localhost> <4A266903.4030507@mcs.anl.gov>
 <4A2674DD.6080902@mcs.anl.gov>
Message-ID:

> (after the Condor-G provider is tested further). Is this command documented? I
> thought there was a section describing it in the Users Guide "How-To Tips for
> Specific User Communities" section, but I was mistaken.

It was there. It is in the commands section now.
--

From zhaozhang at uchicago.edu  Wed Jun  3 09:53:55 2009
From: zhaozhang at uchicago.edu (Zhao Zhang)
Date: Wed, 03 Jun 2009 09:53:55 -0500
Subject: [Swift-user] Re: [Swft] joining up with swift users
In-Reply-To: <4A2674DD.6080902@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EF6@BALI.uhd.campus>
 <1244014767.32393.3.camel@localhost> <4A266903.4030507@mcs.anl.gov>
 <4A2674DD.6080902@mcs.anl.gov>
Message-ID: <4A268E83.8060207@uchicago.edu>

Michael Wilde wrote:
> I think good sites to use for initial work are TeraPort (fast queue) and
> uc-teragrid, and Ranger to work with Sarah's SEM scripts.

I have sites.xml definitions with coaster for uc-teragrid and Ranger,
and condor-g for TeraPort.

zhao

From wilde at mcs.anl.gov  Wed Jun  3 10:09:32 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 03 Jun 2009 10:09:32 -0500
Subject: [Swift-user] Re: [Swft] joining up with swift users
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EFC@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36EF6@BALI.uhd.campus>
 <1244014767.32393.3.camel@localhost> <4A266903.4030507@mcs.anl.gov>
 <4A2674DD.6080902@mcs.anl.gov>
 <70A5AC06FDB5E54482D19E1C04CDFCF307C36EFC@BALI.uhd.campus>
Message-ID: <4A26922C.9070602@mcs.anl.gov>

Erin, I think it's best to start with one or two systems and master them
first.

When you are ready to try OSG, it's best to do this via the "engage" VO,
as that VO has a good support team and also tests the sites in the VO
for correct operation for the VO members.

OSG has no "login" hosts: you need to do everything via globus-job-run
and globus-url-copy. UberFTP is a convenient way to "login" to OSG sites
to peruse files and manage files and directories (basically any
subcommand of FTP is available: ls, pwd, mkdir, rm, etc.).

When you get to the point of wanting to put R at several OSG sites, you
can do one manually using "pacman" and then use "ADEM" to automatically
install it on many sites. That's experimental software, for which we
need to post a link to the latest docs. It was done by Zhengxiong Hou, a
CS grad student from China who spent last year at CI and recently
returned home. He's still eager to see people use, test, and improve ADEM.

Ben may have further (or better) pointers.

- Mike

On 6/3/09 9:52 AM, Hodgess, Erin wrote:
> Hi Mike:
>
> In order to use OSG, should I just log into osgedu.cs.clemson.edu, or is
> there a tp-something to log into, please?
>
> Thanks,
> Erin
>
>
> Erin M.
> Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu

From HodgessE at uhd.edu  Wed Jun  3 14:11:23 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 3 Jun 2009 14:11:23 -0500
Subject: [Swift-user] (no subject)
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F03@BALI.uhd.campus>

Hi Swift Users:

I'm trying to collect all of the files with the extension ".jpg", rotate
them 180 degrees, and produce output of "hoot.xxx.jpeg".

Here is my swift file and my error message.
I've tried lots of things with files and foreach, but I can't make any
sense out of this.

Any suggestions, please?

[erin at tp-login2 swift1]$ cat rot2.swift
type file;

app (file o) rotate(file s, int angle) {
    convert "-rotate" angle @filename(s) @filename(o);
}

file frame[] ;
file output[] ;

foreach ix in frame {
    output[ix] = rotate(frame, 180);
}
[erin at tp-login2 swift1]$
[erin at tp-login2 swift1]$ swift -tc.file tc.data rot2.swift
Could not start execution.
        Compile error in foreach statement at line 10: Compile error in
procedure invocation at line 11: Wrong type for parameter number 0,
expected file, got file[]

Thanks,
Erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From wilde at mcs.anl.gov  Wed Jun  3 14:40:22 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 03 Jun 2009 14:40:22 -0500
Subject: [Swift-user] (no subject)
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F03@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F03@BALI.uhd.campus>
Message-ID: <4A26D1A6.60701@mcs.anl.gov>

(Erin, please use good subject lines when sending to the list, as it
makes it easier to manage and find discussion threads)

The lines:

foreach ix in frame {
    output[ix] = rotate(frame, 180);

should be:

foreach ix,i in frame {
    output[i] = rotate(ix, 180);

or the inner statement can be:

    output[i] = rotate(frame[i], 180);

Not sure, but you might be able to use simple_mapper for input mapping
as well. I'll leave you to experiment with that aspect for now.

- Mike

From wilde at mcs.anl.gov  Wed Jun  3 14:48:21 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 03 Jun 2009 14:48:21 -0500
Subject: [Swift-user] (no subject)
In-Reply-To: <4A26D1A6.60701@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F03@BALI.uhd.campus>
 <4A26D1A6.60701@mcs.anl.gov>
Message-ID: <4A26D385.9060607@mcs.anl.gov>

Also, I should clarify: the message:

"Compile error in procedure invocation at line 11: Wrong type for
parameter number 0, expected file, got file[]"

is really referring to the expression in the assignment statement
*within* the foreach.

It's complaining that you passed an object of data type "file[]" - i.e.,
"array of file" - to a procedure whose corresponding argument was
declared as expecting an object of data type "file" - i.e., a single
scalar object of type "file".

The ability to pass an entire file[] array is a cool feature of Swift,
but not what you wanted to do in this case.

- Mike

From zhaozhang at uchicago.edu  Wed Jun  3 14:51:54 2009
From: zhaozhang at uchicago.edu (Zhao Zhang)
Date: Wed, 03 Jun 2009 14:51:54 -0500
Subject: [Swift-user] (no subject)
In-Reply-To: <4A26D385.9060607@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F03@BALI.uhd.campus>
 <4A26D1A6.60701@mcs.anl.gov> <4A26D385.9060607@mcs.anl.gov>
Message-ID: <4A26D45A.4090608@uchicago.edu>

Hi, Mike

I am helping Erin on this. It seems we have figured the problem out, and
we are testing the script further.

zhao

From amoore2 at uchicago.edu  Wed Jun  3 16:43:35 2009
From: amoore2 at uchicago.edu (Alex Moore)
Date: Wed, 3 Jun 2009 16:43:35 -0500
Subject: [Swift-user] Swift + Matlab
Message-ID: <8a98db410906031443j528b77fej52f68bee5f0b1c53@mail.gmail.com>

Hey Mike,

I'm trying to execute the compiled matlab executable through swift, and
I'm running into some trouble. I can run swift on the examples provided,
and I can compile the matlab executable fine without swift. I've followed
the procedure you detailed exactly (I copied and pasted the email below
for reference), and I get the same error messages.

When I run the swift app you detail below, I get the following messages:

Swift 0.9 swift-r2860 cog-r2388

RunID: 20090603-1623-nis6ejxc
Progress:
Failed to transfer wrapper log from hwsq-20090603-1623-nis6ejxc/info/b on local host
Failed to transfer wrapper log from hwsq-20090603-1623-nis6ejxc/info/a on local host
Progress Selecting site:7 Stage in:1 Failed but can retry:2
Failed to transfer wrapper log from hwsq-20090603-1623-nis6ejxc/info/e on local host
Progress Selecting site:7 Failed but can retry:3
Progress Selecting site:7 Failed but can retry:3
Progress Selecting site:7 Failed but can retry:3
Progress Selecting site:6 Stage in:1 Failed but can retry:3
Failed to transfer wrapper log from hwsq-20090603-1623-nis6ejxc/info/g on local host

It keeps on going, decreasing the "Selecting site:" number. The program
doesn't terminate either (is there some command I can use to stop it?).

I think the trouble might be something with my entries in tc.data? I use
this entry for tc.data that you specified (I changed the path and
directories for my path):

> localhost runhwsq /home/amoore2/Work/hwsq.sh INSTALLED INTEL32::LINUX null

I then changed it to run_hwsq.sh since this is also one of the
executables that matlab produces, but I still got the same error
messages. Also, you seem to use hwsq and runhwsq interchangeably in the
swift file and in the tc.data entry above. Here is part of the
hwsq.swift file:

> app (file outdata) hwsq (file indata, int factor)
> {
>   runhwsq @indata @outdata factor;
> }
>

Is there a reason why in the first line you define the app hwsq, then
refer to runhwsq? Or, use tc.data to tell swift that runhwsq is located
at hwsq.sh?

Also, where did you input this wrapper?

> ...to call this wrapper (which I wrote):
>
> ---
>
> /home/wilde/matlablab/test1/bin/run_hwsq.sh ~/matlablab/MCR/v77/ $*

Thanks.
-Alex Moore

> -------- Original Message --------
> Subject: Swift MatLab example
> Date: Wed, 04 Mar 2009 00:16:57 -0600
> From: Michael Wilde
> To: Andrew Robert Jamieson
> CC: Tibi Stef-Praun, Benjamin Clifford
> References: <49A5F0F7.8000109 at mcs.anl.gov>
>  <49A83FA3.4090702 at mcs.anl.gov>
>  <49ADA7AF.4000106 at mcs.anl.gov>
>
> Andrew,
>
> I played with the mcc compiler till I got a basic test working - a
> simple wrapper around the "magic-square" hello-world program, but which
> passes the degree of the matrix in via an input file, as well as a
> parameter to transform the matrix (multiply by a scalar parameter) and
> write and return the final matrix in a file.
>
> So that gives me this swift app:
>
> ---
>
> type file;
>
> app (file outdata) hwsq (file indata, int factor)
> {
>   runhwsq @indata @outdata factor;
> }
>
> file degreeData<"degree.dat">;
>
> int factors[] = [0:9];
> file squareMats[] ;
>
> foreach f, i in factors {
>   squareMats[i] = hwsq (degreeData, f);
> }
>
> ---
>
> to run this matlab code:
>
> ---
>
> ::::::::::::::
> hwsq.m
> ::::::::::::::
> function m = hwsq(infile,outfile,fac)
> n = dlmread(infile)
> if ischar(fac)
>   fac=str2num(fac);
> end
> m = myf1(n,fac)
> dlmwrite(outfile,m)
> ::::::::::::::
> myf1.m
> ::::::::::::::
> function m = myf1(n,fac)
>
> if ischar(n)
>   n=str2num(n);
> end
> g = magic(n)
> m = g * fac
>
> ---
>
> I use these entries in tc.data:
>
> ---
>
> localhost runhwsq /home/wilde/matlablab/test1/bin/hwsq.sh INSTALLED INTEL32::LINUX null
> uc32 runhwsq /home/wilde/mccapps/hwsq/hwsq.sh INSTALLED INTEL32::LINUX null
>
> ---
>
> ...to call this wrapper (which I wrote):
>
> ---
>
> /home/wilde/matlablab/test1/bin/run_hwsq.sh ~/matlablab/MCR/v77/ $*
>
> ---
>
> which calls the run_hwsq.sh wrapper that mcc created for me, when I
> compiled my .m files with:
>
> ---
>
> mcc -o hwsq -m -d bin -v hwsq.m -a myf1.m
>
> ---
>
> Then I edited run_hwsq.sh to specify a full (instead of relative) path
> for it to invoke
> the hwsq compiled executable:
>
> ---
>
> /home/wilde/matlablab/test1/bin/hwsq $*
>
> instead of:
>
> ./hwsq $*
>
> ---
>
> I think that covers it. I was able to run this locally on
> login.ci.uchicago.edu, as well as to uc-teragrid 32-bit viz nodes, from
> login.ci as submit host.
>
> I think we have all the parts here to run on x86_64 machines as well.
>
> Now that I get the gist of matlab compilation and execution, I think I
> (we?) can turn this into a mini-matlab tutorial.
>
> But hopefully the above can get you started.
>
> Let's meet as soon as possible, ideally after you try to reproduce my
> steps, and then to a first-run of your actual code.
>
> Passing other files, including .mat files, should be similar to the above.
>
> - Mike

From HodgessE at uhd.edu  Wed Jun  3 22:03:39 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 3 Jun 2009 22:03:39 -0500
Subject: [Swift-user] doesn't create output file
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F0C@BALI.uhd.campus>

Hi again!

I'm looking at some of Sarah's really cool stuff and trying to reproduce
test versions.

Here is the swift file:
[erin at tp-login2 bin]$ cat test1.swift
type file{}
#--- trying to learn from Sarah's cool stuff

app (file simResult) simScript (file scriptFile, file inputFile,
    int iter,int tval){
    RInvoke @filename(scriptFile) @filename(inputFile) iter tval;
}

file script<"e1.in">;
file inputData<"xx.dat">;
file simResult<"xy.out">;
int iter = 25;
int tval = 4;
simResult=simScript(script,inputData,iter,tval);

[erin at tp-login2 bin]$

And here is the result:
[erin at tp-login2 bin]$ swift -tc.file tc.data test1.swift
Swift svn swift-r2950 cog-r2406

RunID: 20090603-2200-krmao849
Progress:
Progress: Checking status:1
Progress: Checking status:1
Progress: Checking status:1
Execution failed:
        Exception in RInvoke:
Arguments: [e1.in, xx.dat, 25, 4]
Host: localhost
Directory: test1-20090603-2200-krmao849/jobs/k/RInvoke-k4l6zqbj
stderr.txt:
stdout.txt:
----

Caused by:
        The following output files were not created by the application:
xy.out
[erin at tp-login2 bin]$

The tc.data is here:
[erin at tp-login2 bin]$ cat tc.data
#NOTE WELL: fields in this file must be separated by tabs, not spaces
# and there must be no trailing whitespace at the end of each line.
#
# sitename app pathname (ignored) (ignored) profiles
localhost echo /bin/echo INSTALLED INTEL32::LINUX null
teraport echo /bin/echo INSTALLED INTEL32::LINUX null
localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null
localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null
localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null
localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null
localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null

Any suggestions most appreciated.

Erin

Erin M.
Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From wilde at mcs.anl.gov  Wed Jun  3 23:03:48 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 03 Jun 2009 23:03:48 -0500
Subject: [Swift-user] doesn't create output file
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F0C@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F0C@BALI.uhd.campus>
Message-ID: <4A2747A4.2050805@mcs.anl.gov>

Erin,

Looking at your R script, it seems like it ran OK (from e1.in.Rout) but
there is no R statement in the R input that writes the output object xz
to the data file xy.out where your swift script is expecting to find it.

Also, in future email questions, send the name of the directory that you
ran from - I had to hunt around a bit in /home/erin to find it.

- Mike

R output was:

more e1.in.Rout

R version 2.5.1 (2007-06-27)
Copyright (C) 2007 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> allinputs <- Sys.getenv("R_SWIFT_ARGS")
> print(allinputs)
 R_SWIFT_ARGS
"xx.dat 25 4"
> inputfilename <- noquote(strsplit(allinputs," ")[[1]][1])
> start_num <- as.numeric(noquote(strsplit(allinputs," ")[[1]][2]))
> n1 <- as.numeric(noquote(strsplit(allinputs," ")[[1]][3]))
> print(n1)
[1] 4
> xy <- matrix(scan(inputfilename),byrow=TRUE,ncol=5)
Read 50 items
> xz <- apply(xy,2,mean)
> xz
[1]  0.2526954 -0.7243118  0.2969102 -0.2283567 -0.2468762
>
> proc.time()
   user  system elapsed
  0.775   0.033   0.797

^^^^ No statement to write xy into xy.out (or to write out any other
result, like xz).

From HodgessE at uhd.edu  Wed Jun  3 23:17:34 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 3 Jun 2009 23:17:34 -0500
Subject: [Swift-user] doesn't create output file
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F0C@BALI.uhd.campus>
 <4A2747A4.2050805@mcs.anl.gov>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F0D@BALI.uhd.campus>

That was it!

thank you,
e

Erin M.
Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu
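
A minimal sketch of the missing step discussed in this thread, assuming
the R script (e1.in) should write its column means to the xy.out file
that test1.swift maps simResult to. The vector below is a stand-in
copied from the printed xz values, and the choice of write() is an
assumption, not necessarily the statement that was actually added to
the script:

```r
# Stand-in for the result computed in e1.in (values copied from the
# printed output of xz; the real script computes them from xx.dat).
xz <- c(0.2526954, -0.7243118, 0.2969102, -0.2283567, -0.2468762)

# Write the result to the file name Swift expects the app to create.
# "xy.out" is hard-coded here (an assumption) because test1.swift does
# not pass the output file name to RInvoke.
write(xz, file = "xy.out", ncolumns = length(xz))
```

Passing @filename(simResult) as an extra argument in the app block and
reading it from R_SWIFT_ARGS would likely avoid hard-coding the name,
provided the RInvoke.sh wrapper forwards that argument.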
On 6/3/09 10:03 PM, Hodgess, Erin wrote: > Hi again! > > I'm looking at some of Sarah's really cool stuff and trying to reproduce > test versions. > > Here is the swift file: > [erin at tp-login2 bin]$ cat test1.swift > type file{} > #--- trying to learn from Sarah's cool stuff > > > [erin at tp-login2 bin]$ cat test1.swift > type file{} > #--- trying to learn from Sarah's cool stuff > > > > app (file simResult) simScript (file scriptFile, file inputFile, > int iter,int tval){ > RInvoke @filename(scriptFile) @filename(inputFile) iter tval; > } > > > file script<"e1.in">; > file inputData<"xx.dat">; > file simResult<"xy.out">; > int iter = 25; > int tval = 4; > simResult=simScript(script,inputData,iter,tval); > > [erin at tp-login2 bin]$ > > > And here is the result: > [erin at tp-login2 bin]$ swift -tc.file tc.data test1.swift > Swift svn swift-r2950 cog-r2406 > > RunID: 20090603-2200-krmao849 > Progress: > Progress: Checking status:1 > Progress: Checking status:1 > Progress: Checking status:1 > Execution failed: > Exception in RInvoke: > Arguments: [e1.in, xx.dat, 25, 4] > Host: localhost > Directory: test1-20090603-2200-krmao849/jobs/k/RInvoke-k4l6zqbj > stderr.txt: > stdout.txt: > ---- > > Caused by: > The following output files were not created by the application: > xy.out > [erin at tp-login2 bin]$ > > > The tc.data is here: > [erin at tp-login2 bin]$ cat tc.data > #NOTE WELL: fields in this file must be separated by tabs, not spaces > # and there must be no trailing whitespace at the end of each > line. 
> # > # sitename app pathname (ignored) (ignored) > profiles > localhost echo /bin/echo INSTALLED INTEL32::LINUX null > teraport echo /bin/echo INSTALLED INTEL32::LINUX null > localhost translate /usr/bin/tr INSTALLED > INTEL32::LINUX null > localhost R /home/erin/R-2.9.0/bin/R INSTALLED > INTEL32::LINUX null > localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null > localhost convert /usr/bin/convert INSTALLED > INTEL32::LINUX null > localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh > INSTALLED INTEL32::LINUX null > > Any suggestions most appreciated. > > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From HodgessE at uhd.edu Thu Jun 4 12:36:48 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Thu, 4 Jun 2009 12:36:48 -0500 Subject: [Swift-user] Writing to an output file Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus> Here is what SHOULD be a simple process. I'm writing to an array, and sending that to an output file. Here are the files and the results: [erin at tp-login2 swift1]$ cat iter2.swift type file; app initialCondition() { int x = 10; } int step[0]=initialCondition(); file o <"output.txt">; o=step[0]; [erin at tp-login2 swift1]$ swift -tc.file tc.data iter2.swift Could not compile SwiftScript source: line 4:13: unexpected token: x [erin at tp-login2 swift1]$ The file is in /home/erin/swift1/iter2.swift. Thanks, Erin Erin M. 
Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From wilde at mcs.anl.gov Thu Jun 4 13:06:16 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 04 Jun 2009 13:06:16 -0500
Subject: [Swift-user] Writing to an output file
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus>
Message-ID: <4A280D18.1020508@mcs.anl.gov>

Erin,

An app function can only be the declaration of a single external application. It can't be anything else - and here you're trying to use an app declaration to return a value, like a Java property.

You'll typically want to place statements like "int x = 10" in open code (i.e., outside all other procedures, typically at the end of your script) or inside a compound procedure.

Next, what you're trying to do here doesn't fit well with the Swift model: you typically don't take data that you compute directly in Swift (as opposed to in a Swift app) and write that data to a file. Files are typically written only by app procedures - i.e., they contain the output of programs.

- Mike

In this test, were you just trying to fill an array with scalar values?

On 6/4/09 12:36 PM, Hodgess, Erin wrote:
> Here is what SHOULD be a simple process. I'm writing to an array, and
> sending that to an output file. Here are the files and the results:
>
> [erin at tp-login2 swift1]$ cat iter2.swift
> type file;
>
> app initialCondition() {
> int x = 10;
> }
>
> int step[0]=initialCondition();
>
> file o <"output.txt">;
>
> o=step[0];
> [erin at tp-login2 swift1]$ swift -tc.file tc.data iter2.swift
> Could not compile SwiftScript source: line 4:13: unexpected token: x
> [erin at tp-login2 swift1]$
>
> The file is in /home/erin/swift1/iter2.swift.
>
> Thanks,
> Erin
>
>
>
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

From hategan at mcs.anl.gov Thu Jun 4 16:34:56 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 04 Jun 2009 16:34:56 -0500
Subject: [Swift-user] Writing to an output file
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus>
Message-ID: <1244151296.29231.9.camel@localhost>

For most practical purposes, there are no side-effects in swift. "int x = 0" means "For convenience, and for the current lexical scope, x will be 0". It's the common (though reversed) "statement, where x = 0" from math, except there is no "statement" in your case. Outside of "initialCondition" it doesn't mean anything.

I'm not sure what exactly the compiler is complaining about, but what you wrote doesn't mean much. That plus what Mike said in his reply to this message.

May I suggest expressing in simple words what you are trying to achieve? From that, it's easier to both translate to swift and for us to help with that.

Mihael

On Thu, 2009-06-04 at 12:36 -0500, Hodgess, Erin wrote:
> Here is what SHOULD be a simple process. I'm writing to an array, and
> sending that to an output file.
Here are the files and the results: > > [erin at tp-login2 swift1]$ cat iter2.swift > type file; > > app initialCondition() { > int x = 10; > } > > int step[0]=initialCondition(); > > file o <"output.txt">; > > o=step[0]; > [erin at tp-login2 swift1]$ swift -tc.file tc.data iter2.swift > Could not compile SwiftScript source: line 4:13: unexpected token: x > [erin at tp-login2 swift1]$ > > The file is in /home/erin/swift1/iter2.swift. > > Thanks, > Erin > > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From HodgessE at uhd.edu Thu Jun 4 16:35:44 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Thu, 4 Jun 2009 16:35:44 -0500 Subject: [Swift-user] Writing to an output file References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F1B@BALI.uhd.campus> <1244151296.29231.9.camel@localhost> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F23@BALI.uhd.campus> I talked to Mike, got things fixed, and all is well. thank you, Sincerely, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: Mihael Hategan [mailto:hategan at mcs.anl.gov] Sent: Thu 6/4/2009 4:34 PM To: Hodgess, Erin Cc: swift-user at ci.uchicago.edu Subject: Re: [Swift-user] Writing to an output file For most practical purposes, there are no side-effects in swift. "int x = 0" means "For convenience, and for the current lexical scope, x will be 0". It's the common (though reversed) "statement, where x = 0" from math, except there is no "statement" in your case. Outside of "initialCondition" it doesn't mean anything. 
I'm not sure what exactly the compiler is complaining about, but what you wrote doesn't mean much. That plus what Mike said in his reply to this message.

May I suggest expressing in simple words what you are trying to achieve? From that, it's easier to both translate to swift and for us to help with that.

Mihael

On Thu, 2009-06-04 at 12:36 -0500, Hodgess, Erin wrote:
> Here is what SHOULD be a simple process. I'm writing to an array, and
> sending that to an output file. Here are the files and the results:
>
> [erin at tp-login2 swift1]$ cat iter2.swift
> type file;
>
> app initialCondition() {
> int x = 10;
> }
>
> int step[0]=initialCondition();
>
> file o <"output.txt">;
>
> o=step[0];
> [erin at tp-login2 swift1]$ swift -tc.file tc.data iter2.swift
> Could not compile SwiftScript source: line 4:13: unexpected token: x
> [erin at tp-login2 swift1]$
>
> The file is in /home/erin/swift1/iter2.swift.
>
> Thanks,
> Erin
>
>
>
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
>
>
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From HodgessE at uhd.edu Sat Jun 6 18:40:19 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Sat, 6 Jun 2009 18:40:19 -0500 Subject: [Swift-user] trying coasters and sites1.xml Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F36@BALI.uhd.campus> Dear Swift People: I have a process, fun1.swift, which works just fine, as you can see: [erin at tp-login2 bin]$ swift -tc.file tc.data fun1.swift Swift svn swift-r2950 cog-r2406 RunID: 20090606-1836-j7rc2402 Progress: Progress: Selecting site:8 Active:1 Checking status:1 Progress: Selecting site:6 Active:1 Checking status:1 Finished successfully:2 Progress: Selecting site:4 Active:1 Checking status:1 Finished successfully:4 Progress: Selecting site:2 Active:1 Checking status:1 Finished successfully:6 Progress: Active:1 Checking status:1 Finished successfully:8 Final status: Finished successfully:10 [erin at tp-login2 bin]$ However, if I run with a sites file, I get: [erin at tp-login2 bin]$ swift -tc.file tc.data -sites.file sites1.xml fun1.swift Swift svn swift-r2950 cog-r2406 RunID: 20090606-1837-xu7t94t5 Progress: Execution failed: Could not find any valid host for task "Task(type=UNKNOWN, identity=urn:cog-1244331435263)" with constraints {tr=RInvoke, filenames=[Ljava.lang.String;@4b6218f9, trfqn=RInvoke, filecache=org.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 1555aa19} [erin at tp-login2 bin]$ here is the sites1 file: [erin at tp-login2 bin]$ cat sites1.xml fast 00:30:00 2 /home/erin/R-2.9.0/bin [erin at tp-login2 bin]$ Now there is one other thing. The fun1.swift calls a program called R, which is not on all of the sites. Could that be the problem, please? 
Here is the fun1.swift file: [erin at tp-login2 bin]$ cat fun1.swift type file{} (file rout) perm_r (file datfile, int pnum, file scriptname) { app { RInvoke @filename(scriptname) pnum @filename(datfile) ; } } file r_script; file perm_matrix; foreach i in [1:10] { file r_out ; (r_out) = perm_r(perm_matrix, i, r_script); } [erin at tp-login2 bin]$ All of this stuff is in /home/erin/R-2.9.0/bin directory. Thanks and keep warm! Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Sat Jun 6 19:48:50 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 06 Jun 2009 19:48:50 -0500 Subject: [Swift-user] trying coasters and sites1.xml In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F36@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F36@BALI.uhd.campus> Message-ID: <4A2B0E72.90909@mcs.anl.gov> Erin, See pointers below. 
- Mike

On 6/6/09 6:40 PM, Hodgess, Erin wrote:
> Dear Swift People:
>
> I have a process, fun1.swift, which works just fine, as you can see:
>
> [erin at tp-login2 bin]$ swift -tc.file tc.data fun1.swift
> Swift svn swift-r2950 cog-r2406
>
> RunID: 20090606-1836-j7rc2402
> Progress:
> Progress: Selecting site:8 Active:1 Checking status:1
> Progress: Selecting site:6 Active:1 Checking status:1 Finished
> successfully:2
> Progress: Selecting site:4 Active:1 Checking status:1 Finished
> successfully:4
> Progress: Selecting site:2 Active:1 Checking status:1 Finished
> successfully:6
> Progress: Active:1 Checking status:1 Finished successfully:8
> Final status: Finished successfully:10
> [erin at tp-login2 bin]$
>
>
> However, if I run with a sites file, I get:
>
> [erin at tp-login2 bin]$ swift -tc.file tc.data -sites.file sites1.xml
> fun1.swift
> Swift svn swift-r2950 cog-r2406
>
> RunID: 20090606-1837-xu7t94t5
> Progress:
> Execution failed:
> Could not find any valid host for task "Task(type=UNKNOWN,
> identity=urn:cog-1244331435263)" with constraints {tr=RInvoke,
> filenames=[Ljava.lang.String;@4b6218f9, trfqn=RInvoke,
> filecache=org.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 1555aa19}
> [erin at tp-login2 bin]$
>

That message means Swift is trying to run an application named RInvoke. It looked through tc.data and did not find it listed as installed on any of the sites in your sites.xml.

You need to replicate this line in your tc.data file:

localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null

for the site "teraport", and for any other sites you want Swift to try to run it on.
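The tc.data header quoted in this thread requires tab-separated fields and no trailing whitespace, which is easy to get wrong when an entry is copied by hand. As a rough sketch, the shell helpers below print an entry for a given site with real tabs and check a line for stray spaces. The names tc_entry and tc_line_ok are hypothetical and not part of Swift; the INSTALLED, INTEL32::LINUX and null columns are simply copied from the existing entries.

```shell
# Hypothetical helper: print one tc.data entry with tab-separated fields.
# $1 = site name, $2 = application name, $3 = absolute path on that site
tc_entry() {
    printf '%s\t%s\t%s\tINSTALLED\tINTEL32::LINUX\tnull\n' "$1" "$2" "$3"
}

# Hypothetical check: succeed only if the line has exactly six
# tab-separated fields, no spaces, and no trailing whitespace.
tc_line_ok() {
    printf '%s\n' "$1" |
        awk -F '\t' 'NF != 6 || / / || /[ \t]$/ { bad = 1 } END { exit bad }'
}

entry=$(tc_entry teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh)
printf '%s\n' "$entry"
```

Appending the printed line with `tc_entry teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh >> tc.data` would add the entry without introducing spaces.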
So add: teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null (Teraport uses the same file server and path names to your CI home directory) > here is the sites1 file: > > [erin at tp-login2 bin]$ cat sites1.xml > > > > fast > key="coasterWorkerMaxwalltime">00:30:00 > 2 > > > /home/erin/R-2.9.0/bin > > > [erin at tp-login2 bin]$ > > Now there is one other thing. The fun1.swift calls a program called R, > which is not on all of the sites. Could that be the problem, please? You mean "RInvoke", not R, I think. And yes, as above. > Here is the fun1.swift file: > [erin at tp-login2 bin]$ cat fun1.swift > type file{} > > (file rout) perm_r (file datfile, int pnum, file scriptname) > { > app > { > RInvoke @filename(scriptname) pnum @filename(datfile) ; > } > } > > > file r_script; > > file perm_matrix; > > foreach i in [1:10] > { > file r_out file=@strcat("results/",i,".NG-S.out")>; > (r_out) = perm_r(perm_matrix, i, r_script); > } > [erin at tp-login2 bin]$ > > > All of this stuff is in /home/erin/R-2.9.0/bin directory. > > Thanks and keep warm! > > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From HodgessE at uhd.edu Sat Jun 6 20:47:23 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Sat, 6 Jun 2009 20:47:23 -0500 Subject: [Swift-user] running on teraport Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F3A@BALI.uhd.campus> Hello again. I'm running on the teraport, and received all of these messages. 
[erin at tp-login2 bin]$ swift -tc.file tc.data -sites.file sites1.xml fun1.swift Swift svn swift-r2950 cog-r2406 RunID: 20090606-2016-8bxotcug Progress: Progress: Selecting site:9 Initializing site shared directory:1 Progress: Selecting site:9 Stage in:1 Progress: Selecting site:8 Submitting:1 Submitted:1 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/a on teraport Progress: Selecting site:7 Submitting:1 Submitted:1 Failed but can retry:1 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/9 on teraport Progress: Selecting site:7 Submitted:1 Failed but can retry:2 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/c on teraport Progress: Selecting site:6 Submitted:1 Failed but can retry:3 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/e on teraport Progress: Selecting site:6 Failed but can retry:4 Progress: Selecting site:6 Failed but can retry:4 Progress: Selecting site:6 Failed but can retry:4 Progress: Selecting site:5 Stage in:1 Failed but can retry:4 Progress: Selecting site:5 Submitted:1 Failed but can retry:4 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/h on teraport Progress: Selecting site:5 Failed but can retry:5 Progress: Selecting site:5 Failed but can retry:5 Progress: Selecting site:5 Failed but can retry:5 Progress: Selecting site:4 Stage in:1 Failed but can retry:5 Progress: Selecting site:4 Submitted:1 Failed but can retry:5 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/j on teraport Progress: Selecting site:4 Failed but can retry:6 Progress: Selecting site:4 Failed but can retry:6 Progress: Selecting site:4 Failed but can retry:6 Progress: Selecting site:3 Stage in:1 Failed but can retry:6 Progress: Selecting site:3 Submitted:1 Failed but can retry:6 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/l on teraport Progress: Selecting site:3 Failed but can retry:7 Progress: Selecting site:3 Failed but can 
retry:7 Progress: Selecting site:3 Failed but can retry:7 Progress: Selecting site:2 Stage in:1 Failed but can retry:7 Progress: Selecting site:2 Submitted:1 Failed but can retry:7 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/n on teraport Progress: Selecting site:2 Failed but can retry:8 Progress: Selecting site:2 Failed but can retry:8 Progress: Selecting site:2 Failed but can retry:8 Progress: Selecting site:1 Stage in:1 Failed but can retry:8 Progress: Selecting site:1 Submitted:1 Failed but can retry:8 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/p on teraport Progress: Selecting site:1 Failed but can retry:9 Progress: Selecting site:1 Failed but can retry:9 Progress: Selecting site:1 Failed but can retry:9 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/r on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitting:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/t on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/v on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/y on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from 
fun1-20090606-2016-8bxotcug/info/0 on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/2 on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/4 on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/6 on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/8 on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/a on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/c on teraport Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Failed but can retry:10 Progress: Stage in:1 Failed but can retry:9 Progress: Submitted:1 Failed but can retry:9 Failed to transfer wrapper log from 
fun1-20090606-2016-8bxotcug/info/e on teraport Execution failed: Exception in RInvoke: Arguments: [ttest1.R, 1, fun.dat] Host: teraport Directory: fun1-20090606-2016-8bxotcug/jobs/e/RInvoke-enuqrvbj stderr.txt: stdout.txt: ---- Caused by: Could not submit job Caused by: Could not start coaster service Caused by: Task ended before registration was received. STDOUT: STDERR: which: no gmd5sum in (/soft/java-1.6.0_11-sun-r1/bin:/soft/java-1.6.0_11-sun-r1/jre/bin:/soft/apache-ant-1.7.1-r1/bin:/software/common/gx-map-0.5.3.3-r1/bin:/soft/condor-7.0.5-r1/bin:/soft/globus-4.2.1-r1/bin:/soft/globus-4.2.1-r1/sbin:/usr/kerberos/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/software/common/softenv-1.6.0-r1/bin:/home/erin/bin/linux-rhel5-x86_64:/home/erin/bin:/soft/torque-2.3.6-r1/bin:/soft/maui-3.2.6p21-r1/bin:/soft/maui-3.2.6p21-r1/sbin:/soft/matlab-7.7-r1/bin:/soft/osg-client-1.0.0-r1/lcg/bin:/soft/osg-client-1.0.0-r1/srm-client-lbnl/bin:/soft/osg-client-1.0.0-r1/srm-client-fermi/sbin:/soft/osg-client-1.0.0-r1/srm-client-fermi/bin:/soft/osg-client-1.0.0-r1/curl/bin:/soft/osg-client-1.0.0-r1/wget/bin:/soft/osg-client-1.0.0-r1/cert-scripts/bin:/soft/osg-client-1.0.0-r1/glite/sbin:/soft/osg-client-1.0.0-r1/glite/bin:/soft/osg-client-1.0.0-r1/pyglobus-url-copy/bin:/soft/osg-client-1.0.0-r1/pegasus/bin:/soft/osg-client-1.0.0-r1/ant/bin:/soft/osg-client-1.0.0-r1/gpt/sbin:/soft/osg-client-1.0.0-r1/globus/bin:/soft/osg-client-1.0.0-r1/globus/sbin:/soft/osg-client-1.0.0-r1/jdk1.5/bin:/soft/osg-client-1.0.0-r1/condor/sbin:/soft/osg-client-1.0.0-r1/condor/bin:/soft/osg-client-1.0.0-r1/logrotate/sbin:/software/common/pacman-3.26-r1/bin:/soft/osg-client-1.0.0-r1/vdt/sbin:/soft/osg-client-1.0.0-r1/vdt/bin:/home/grog/bin/linux-rhel4-ia32:/home/grog/bin:/sw/bin:/sbin:/usr/sbin:/home/erin/bin:/home/erin/cog/modules/swift/dist/swift-svn/bin) Caused by: Job failed with an exit code of 1 Cleaning up... Done So anyhow, it doesn't like something. 
Here is the tc.data file:
[erin at tp-login2 bin]$ cat tc.data
#NOTE WELL: fields in this file must be separated by tabs, not spaces
# and there must be no trailing whitespace at the end of each line.
#
# sitename app pathname (ignored) (ignored) profiles
localhost echo /bin/echo INSTALLED INTEL32::LINUX null
teraport echo /bin/echo INSTALLED INTEL32::LINUX null
localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null
localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null
localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null
localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null
localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null
teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From wilde at mcs.anl.gov Sat Jun 6 21:31:32 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 06 Jun 2009 21:31:32 -0500
Subject: [Swift-user] running on teraport
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F3A@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36F3A@BALI.uhd.campus>
Message-ID: <4A2B2684.8040009@mcs.anl.gov>

You're getting an error when Swift tries to start coasters, because you don't have a valid Grid proxy.

You need to do a grid-proxy-init before you run with coasters, even if you are using them with jobmanager=local:pbs, because coasters uses grid security for its communication channels.

These errors were in your coaster bootstrap log:

java.lang.RuntimeException: Failed to register service
...
Caused by: org.globus.cog.karajan.workflow.service.channels.ChannelException: Failed to start channel GSSCChannel-https://128.135.125.117:50000(1)
...
Caused by: org.globus.gsi.GlobusCredentialException: [JGLOBUS-10] Expired credentials (DC=org,DC=doegrids,OU=People,...

in your $HOME.

--

For initial tests, it's usually simpler to test using the local pbs provider instead of coasters:

Once that works, try coasters.

- Mike

On 6/6/09 8:47 PM, Hodgess, Erin wrote:
> Hello again.
>
> I'm running on the teraport, and received all of these messages.
>
>
>
> [erin at tp-login2 bin]$ swift -tc.file tc.data -sites.file sites1.xml
> fun1.swift
> Swift svn swift-r2950 cog-r2406
>
> RunID: 20090606-2016-8bxotcug
> Progress:
> Progress: Selecting site:9 Initializing site shared directory:1
> Progress: Selecting site:9 Stage in:1
> Progress: Selecting site:8 Submitting:1 Submitted:1
> Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/a
> on teraport
> Progress: Selecting site:7 Submitting:1 Submitted:1 Failed but can
> retry:1
> Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/9
> on teraport
> Progress: Selecting site:7 Submitted:1 Failed but can retry:2
> Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/c
> on teraport
> Progress: Selecting site:6 Submitted:1 Failed but can retry:3
> Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/e
> on teraport
> Progress: Selecting site:6 Failed but can retry:4
> Progress: Selecting site:6 Failed but can retry:4
> Progress: Selecting site:6 Failed but can retry:4
> Progress: Selecting site:5 Stage in:1 Failed but can retry:4
> Progress: Selecting site:5 Submitted:1 Failed but can retry:4
> Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/h
> on teraport
> Progress: Selecting site:5 Failed but can retry:5
> Progress: Selecting site:5 Failed but can retry:5
> Progress: Selecting site:5 Failed but can retry:5
> Progress: Selecting site:4 Stage in:1 Failed but can retry:5
> Progress: Selecting site:4 Submitted:1 Failed but can retry:5
> Failed to transfer wrapper log from
fun1-20090606-2016-8bxotcug/info/j > on teraport > Progress: Selecting site:4 Failed but can retry:6 > Progress: Selecting site:4 Failed but can retry:6 > Progress: Selecting site:4 Failed but can retry:6 > Progress: Selecting site:3 Stage in:1 Failed but can retry:6 > Progress: Selecting site:3 Submitted:1 Failed but can retry:6 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/l > on teraport > Progress: Selecting site:3 Failed but can retry:7 > Progress: Selecting site:3 Failed but can retry:7 > Progress: Selecting site:3 Failed but can retry:7 > Progress: Selecting site:2 Stage in:1 Failed but can retry:7 > Progress: Selecting site:2 Submitted:1 Failed but can retry:7 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/n > on teraport > Progress: Selecting site:2 Failed but can retry:8 > Progress: Selecting site:2 Failed but can retry:8 > Progress: Selecting site:2 Failed but can retry:8 > Progress: Selecting site:1 Stage in:1 Failed but can retry:8 > Progress: Selecting site:1 Submitted:1 Failed but can retry:8 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/p > on teraport > Progress: Selecting site:1 Failed but can retry:9 > Progress: Selecting site:1 Failed but can retry:9 > Progress: Selecting site:1 Failed but can retry:9 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/r > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitting:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/t > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to 
transfer wrapper log from fun1-20090606-2016-8bxotcug/info/v > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/y > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/0 > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/2 > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/4 > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/6 > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/8 > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > 
Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/a > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/c > on teraport > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > Progress: Failed but can retry:10 > > > > Progress: Stage in:1 Failed but can retry:9 > Progress: Submitted:1 Failed but can retry:9 > Failed to transfer wrapper log from fun1-20090606-2016-8bxotcug/info/e > on teraport > Execution failed: > Exception in RInvoke: > Arguments: [ttest1.R, 1, fun.dat] > Host: teraport > Directory: fun1-20090606-2016-8bxotcug/jobs/e/RInvoke-enuqrvbj > stderr.txt: > > stdout.txt: > > ---- > > Caused by: > Could not submit job > Caused by: > Could not start coaster service > Caused by: > Task ended before registration was received. 
> STDOUT: > STDERR: which: no gmd5sum in > (/soft/java-1.6.0_11-sun-r1/bin:/soft/java-1.6.0_11-sun-r1/jre/bin:/soft/apache-ant-1.7.1-r1/bin:/software/common/gx-map-0.5.3.3-r1/bin:/soft/condor-7.0.5-r1/bin:/soft/globus-4.2.1-r1/bin:/soft/globus-4.2.1-r1/sbin:/usr/kerberos/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/software/common/softenv-1.6.0-r1/bin:/home/erin/bin/linux-rhel5-x86_64:/home/erin/bin:/soft/torque-2.3.6-r1/bin:/soft/maui-3.2.6p21-r1/bin:/soft/maui-3.2.6p21-r1/sbin:/soft/matlab-7.7-r1/bin:/soft/osg-client-1.0.0-r1/lcg/bin:/soft/osg-client-1.0.0-r1/srm-client-lbnl/bin:/soft/osg-client-1.0.0-r1/srm-client-fermi/sbin:/soft/osg-client-1.0.0-r1/srm-client-fermi/bin:/soft/osg-client-1.0.0-r1/curl/bin:/soft/osg-client-1.0.0-r1/wget/bin:/soft/osg-client-1.0.0-r1/cert-scripts/bin:/soft/osg-client-1.0.0-r1/glite/sbin:/soft/osg-client-1.0.0-r1/glite/bin:/soft/osg-client-1.0.0-r1/pyglobus-url-copy/bin:/soft/osg-client-1.0.0-r1/pegasus/bin:/soft/osg-client-1.0.0-r1/ant/bin:/soft/osg-client-1.0.0-r1/gpt/sbin:/so ft/osg-client-1.0.0-r1/globus/bin:/soft/osg-client-1.0.0-r1/globus/sbin:/soft/osg-client-1.0.0-r1/jdk1.5/bin:/soft/osg-client-1.0.0-r1/condor/sbin:/soft/osg-client-1.0.0-r1/condor/bin:/soft/osg-client-1.0.0-r1/logrotate/sbin:/software/common/pacman-3.26-r1/bin:/soft/osg-client-1.0.0-r1/vdt/sbin:/soft/osg-client-1.0.0-r1/vdt/bin:/home/grog/bin/linux-rhel4-ia32:/home/grog/bin:/sw/bin:/sbin:/usr/sbin:/home/erin/bin:/home/erin/cog/modules/swift/dist/swift-svn/bin) > > > Caused by: > Job failed with an exit code of 1 > Cleaning up... > Done > > > So anyhow, it doesn't like something. > > Here is the tc.data file: > erin at tp-login2 bin]$ cat tc.data > #NOTE WELL: fields in this file must be separated by tabs, not spaces > # and there must be no trailing whitespace at the end of each > line. 
> # > # sitename app pathname (ignored) (ignored) > profiles > localhost echo /bin/echo INSTALLED INTEL32::LINUX null > teraport echo /bin/echo INSTALLED INTEL32::LINUX null > localhost translate /usr/bin/tr INSTALLED > INTEL32::LINUX null > localhost R /home/erin/R-2.9.0/bin/R INSTALLED > INTEL32::LINUX null > localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null > localhost convert /usr/bin/convert INSTALLED > INTEL32::LINUX null > localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh > INSTALLED INTEL32::LINUX null > teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh > INSTALLED INTEL32::LINUX null > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From HodgessE at uhd.edu Wed Jun 10 12:00:25 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Wed, 10 Jun 2009 12:00:25 -0500 Subject: [Swift-user] bad stuff from one of the tutorials. Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FAA@BALI.uhd.campus> Here is an example from one of the tutorials which is not working. [erin at tp-login2 swift1]$ cat fold9.swift type counterfile; (counterfile t) echo(string m) { app { echo m stdout=@filename(t); } } (counterfile t) countstep(counterfile i) { app { wcl @filename(i) @filename(t); } } counterfile a[] ; a[0] = echo("793578934574893"); iterate v { a[v+1] = countstep(a[v]); } until (@extractint(a[v+1]) <= 1); [erin at tp-login2 swift1]$ cat tc.data #NOTE WELL: fields in this file must be separated by tabs, not spaces # and there must be no trailing whitespace at the end of each line. 
#
# sitename app pathname (ignored) (ignored) profiles
localhost echo /bin/echo INSTALLED INTEL32::LINUX null
teraport echo /bin/echo INSTALLED INTEL32::LINUX null
localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null
localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null
localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null
localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null
localhost wcl /home/erin/swift1 INSTALLED INTEL32::LINUX null
[erin at tp-login2 swift1]$ swift -tc.file tc.data fold9.swift
Could not start execution.
variable a has multip


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From HodgessE at uhd.edu Wed Jun 10 13:59:17 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 10 Jun 2009 13:59:17 -0500
Subject: [Swift-user] weird "non output" from swift procedure
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FAD@BALI.uhd.campus>

Hi again!

This runs without errors, but it does not produce any output.

Is it missing some file names, please?

Thanks,
Erin



[erin at tp-login2 ~]$ swift -tc.file tc.data restart.swift
Swift svn swift-r2950 cog-r2406

RunID: 20090610-1355-ccu7emi4
Progress:
Progress: Checking status:1 Finished successfully:3
Final status: Finished successfully:4
[erin at tp-login2 ~]$ cat restart.swift
type file;

(file f) touch() {
  app {
    touch @f;
  }
}

(file f) processL(file inp) {
  app {
    echo "processL" stdout=@f;
  }
}

(file f) processR(file inp) {
  app {
    broken "process" stdout=@f;
  }
}

(file f) join(file left, file right) {
  app {
    echo "join" @left @right stdout=@f;
  }
}


file f = touch();

file g = processL(f);
file h = processR(f);

file i = join(g,h);
[erin at tp-login2 ~]$


Erin M.
Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From wilde at mcs.anl.gov Wed Jun 10 14:20:24 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jun 2009 14:20:24 -0500
Subject: [Swift-user] weird "non output" from swift procedure
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FAD@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FAD@BALI.uhd.campus>
Message-ID: <4A300778.3050101@mcs.anl.gov>

Erin,

I suspect that, since you did not specify any mappings, *if* the script
ran correctly it produced its output files under the directory
_concurrent.

This "concurrent" mapper and its directory convention are the default
when no output mapping is specified, and are described in the user guide.

Can you check whether the expected output files (which will have weird
names) are under that directory? (Do an "ls -lRt" on it.)

- Mike


On 6/10/09 1:59 PM, Hodgess, Erin wrote:
> Hi again!
>
> This runs without errors, but it does not produce any output.
>
> Is it missing some file names, please?
>
> Thanks,
> Erin
>
>
>
> [erin at tp-login2 ~]$ swift -tc.file tc.data restart.swift
> Swift svn swift-r2950 cog-r2406
>
> RunID: 20090610-1355-ccu7emi4
> Progress:
> Progress: Checking status:1 Finished successfully:3
> Final status: Finished successfully:4
> [erin at tp-login2 ~]$ cat restart.swift
> type file;
>
> (file f) touch() {
>   app {
>     touch @f;
>   }
> }
>
> (file f) processL(file inp) {
>   app {
>     echo "processL" stdout=@f;
>   }
> }
>
> (file f) processR(file inp) {
>   app {
>     broken "process" stdout=@f;
>   }
> }
>
> (file f) join(file left, file right) {
>   app {
>     echo "join" @left @right stdout=@f;
>   }
> }
>
>
> file f = touch();
>
> file g = processL(f);
> file h = processR(f);
>
> file i = join(g,h);
> [erin at tp-login2 ~]$
>
>
> Erin M.
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From HodgessE at uhd.edu Wed Jun 10 14:28:14 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Wed, 10 Jun 2009 14:28:14 -0500 Subject: [Swift-user] weird "non output" from swift procedure References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FAD@BALI.uhd.campus> <4A300778.3050101@mcs.anl.gov> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FB1@BALI.uhd.campus> That's where they are. thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: Michael Wilde [mailto:wilde at mcs.anl.gov] Sent: Wed 6/10/2009 2:20 PM To: Hodgess, Erin Cc: swift-user at ci.uchicago.edu Subject: Re: [Swift-user] weird "non output" from swift procedure Erin, I suspect that since you did not specify any mappings, that *if* the script ran correctly, it produced output files under the directory _concurrent. This "concurrent" mapper and its directory convention are the default when no output mapping is specified, and is described in the user guide. Can you check is the expected output files (which will have weird names) is under that directory? (Do an "ls -lRt" on it). - Mike On 6/10/09 1:59 PM, Hodgess, Erin wrote: > Hi again! > > This runs without errors, but it does not produce any output. > > Is it missing some file names, please? 
> > Thanks, > Erin > > > > [erin at tp-login2 ~]$ swift -tc.file tc.data restart.swift > Swift svn swift-r2950 cog-r2406 > > RunID: 20090610-1355-ccu7emi4 > Progress: > Progress: Checking status:1 Finished successfully:3 > Final status: Finished successfully:4 > [erin at tp-login2 ~]$ cat restart.swift > type file; > > (file f) touch() { > app { > touch @f; > } > } > > (file f) processL(file inp) { > app { > echo "processL" stdout=@f; > } > } > > (file f) processR(file inp) { > app { > broken "process" stdout=@f; > } > } > > (file f) join(file left, file right) { > app { > echo "join" @left @right stdout=@f; > } > } > > > file f = touch(); > > file g = processL(f); > file h = processR(f); > > file i = join(g,h); > [erin at tp-login2 ~]$ > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From fedorov at cs.wm.edu Fri Jun 12 13:18:09 2009 From: fedorov at cs.wm.edu (Andriy Fedorov) Date: Fri, 12 Jun 2009 14:18:09 -0400 Subject: [Swift-user] Swift on local resources Message-ID: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> Hi, I am trying to set up Swift with the local cluster and non-cluster resources in our lab. Here some configuration details. Due to technical problems, passphrase login is not possible for the nodes on local network, and I need to enter password each time. For the cluster, I was able to set up passphrase login for the head node. The cluster is running Lava and Condor schedulers at the same time, but Lava should be used if possible. 
Two questions:

(1) is it possible to configure Swift to talk to Lava scheduler?

(2) I am following the instructions on setting up ssh site provider to
use nodes on the local network.
 (2.1) do I need to set up auth.defaults even if I have ssh-agent
running, and can ssh to the remote node without being asked for
password?
 (2.2.) can anybody give me more detailed instructions on how to set
up auth.defaults? I cannot make it work.
Thanks

Andriy Fedorov

From wilde at mcs.anl.gov Fri Jun 12 13:52:45 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 12 Jun 2009 13:52:45 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
Message-ID: <4A32A3FD.8000603@mcs.anl.gov>

Andriy,

Ben or Mihael may have better ideas, but I offer my thoughts below.

On 6/12/09 1:18 PM, Andriy Fedorov wrote:
> Hi,
>
> I am trying to set up Swift with the local cluster and non-cluster
> resources in our lab. Here some configuration details.
>
> Due to technical problems, passphrase login is not possible for the
> nodes on local network, and I need to enter password each time.
>
> For the cluster, I was able to set up passphrase login for the head
> node. The cluster is running Lava and Condor schedulers at the same
> time, but Lava should be used if possible.
>
> Two questions:
>
> (1) is it possible to configure Swift to talk to Lava scheduler?

Making Swift talk to a new scheduler means writing a new CoG provider
(in Java). You can likely use an existing "data" provider like "local";
you could model the "execution" provider after the "PBS" provider. How
hard this is depends on how close Lava is to PBS in nature. (I don't
know it.) And the provider interface you need to code to is not well
documented afaik.

I would try the Condor provider.
While that provider is less mature and tested than others, it should
work, and if it doesn't, we should try to fix it.

If possible, make sure a simple condor_submit hello-world works for you
first.

Run swift on the head/login node; use the "local" data provider.

Another route is to use Falkon, but that will be harder and it's less
supported, so I suggest against this until easier routes are exhausted.

I don't think that ssh will get you far, as to leverage the cluster I
think you'd need to describe each worker node with a separate sites.xml
entry. That's fine in principle, but a bit awkward, and may have
scheduling issues (i.e. if ssh hangs or dies when you don't own the
node).

Save ssh as another last resort; I suggest trying Condor first.

If needed, people who used ssh recently can send you the info below.

- Mike

> (2) I am following the instructions on setting up ssh site provider to
> use nodes on the local network.
> (2.1) do I need to set up auth.defaults even if I have ssh-agent
> running, and can ssh to the remote node without being asked for
> password?
> (2.2.) can anybody give me more detailed instructions on how to set
> up auth.defaults? I cannot make it work.
> Thanks
>
> Andriy Fedorov
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

From fedorov at cs.wm.edu Fri Jun 12 14:06:15 2009
From: fedorov at cs.wm.edu (Andriy Fedorov)
Date: Fri, 12 Jun 2009 15:06:15 -0400
Subject: [Swift-user] Swift on local resources
In-Reply-To: <4A32A3FD.8000603@mcs.anl.gov>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov>
Message-ID: <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>

Michael,

Thank you for the advice, I will look into this. This is very helpful.
I had the impression that Lava is not included in the list of schedulers
supported out of the box, but wanted to check.
Just a clarification -- I need to access two different types of local resources. Cluster (via Lava or Condor) is one, but for the multicore nodes we have on the network, which are not part of cluster, the only option is to use ssh. AF On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: > Andriy, > > Ben or Mihael may have better ideas, but I offer my thoughts below. > > On 6/12/09 1:18 PM, Andriy Fedorov wrote: >> >> Hi, >> >> I am trying to set up Swift with the local cluster and non-cluster >> resources in our lab. Here some configuration details. >> >> Due to technical problems, passphrase login is not possible for the >> nodes on local network, and I need to enter password each time. >> >> For the cluster, I was able to set up passphrase login for the head >> node. The cluster is running Lava and Condor schedulers at the same >> time, but Lava should be used if possible. >> >> Two questions: >> >> (1) is it possible to configure Swift to talk to Lava scheduler? > > Making Swift talk to a new scheduler means writing a new CoG provider (in > Java). You can likely use an existing "data" provider like "local"; you > could model the "execution" provider after the "PBS" provider. How hard this > is depends on how close Lava is to PBS in nature. (I dont know it). And the > provider interface you need to code to is not well documented afaik. > > I would try the Condor provider. While that provider is less mature and > tested than others, it should work, and if it doesnt, we should try to fix > it. > > If possible, make sure a simple condor_submit hello-world works for you > first. > > Run swift on the head/login node; use the "local" data provider. > > Another route is to use Falkon, but that will be harder and its less > supported, so I suggest against this until easier routes are exhausted. > > I dont think that ssh will get you far, as to leverage the cluster I think > you'd need to describe each worker node with a separate sites.xml entry. 
> Thats fine in principle, but a bit awkward, and may have scheduling issues > (ie if ssh hangs or dies when you dont own the node). > > Save ssh as another last resort; I suggest trying Condor first. > > If needed, people who used ssh recently can send you the info below. > > - Mike > >> (2) I am following the instructions on setting up ssh site provider to >> use nodes on the local network. >> (2.1) do I need to set up auth.defaults even if I have ssh-agent >> running, and can ssh to the remote node without being asked for >> password? >> (2.2.) can anybody give me more detailed instructions on how to set >> up auth.defaults? I cannot make it work. > > >> Thanks >> >> Andriy Fedorov >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > From wilde at mcs.anl.gov Fri Jun 12 14:18:22 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 12 Jun 2009 14:18:22 -0500 Subject: [Swift-user] Swift on local resources In-Reply-To: <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> Message-ID: <4A32A9FE.4000600@mcs.anl.gov> Ah, very cool. Im eager to get more user experience feedback on multicore use. So I will try to hunt down my examples of .ssh configs. Also, Allan Espinosa used this recently. Allan, can you post details and examples? Thanks! Mike On 6/12/09 2:06 PM, Andriy Fedorov wrote: > Michael, > > Thank you for the advice, I will look into this. This is very helpful. > I had an impression Lava is not included in the list of schedulers > supported out of the box, but wanted to check. > > Just a clarification -- I need to access two different types of local > resources. 
Cluster (via Lava or Condor) is one, but for the multicore > nodes we have on the network, which are not part of cluster, the only > option is to use ssh. > > AF > > > > On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: >> Andriy, >> >> Ben or Mihael may have better ideas, but I offer my thoughts below. >> >> On 6/12/09 1:18 PM, Andriy Fedorov wrote: >>> Hi, >>> >>> I am trying to set up Swift with the local cluster and non-cluster >>> resources in our lab. Here some configuration details. >>> >>> Due to technical problems, passphrase login is not possible for the >>> nodes on local network, and I need to enter password each time. >>> >>> For the cluster, I was able to set up passphrase login for the head >>> node. The cluster is running Lava and Condor schedulers at the same >>> time, but Lava should be used if possible. >>> >>> Two questions: >>> >>> (1) is it possible to configure Swift to talk to Lava scheduler? >> Making Swift talk to a new scheduler means writing a new CoG provider (in >> Java). You can likely use an existing "data" provider like "local"; you >> could model the "execution" provider after the "PBS" provider. How hard this >> is depends on how close Lava is to PBS in nature. (I dont know it). And the >> provider interface you need to code to is not well documented afaik. >> >> I would try the Condor provider. While that provider is less mature and >> tested than others, it should work, and if it doesnt, we should try to fix >> it. >> >> If possible, make sure a simple condor_submit hello-world works for you >> first. >> >> Run swift on the head/login node; use the "local" data provider. >> >> Another route is to use Falkon, but that will be harder and its less >> supported, so I suggest against this until easier routes are exhausted. >> >> I dont think that ssh will get you far, as to leverage the cluster I think >> you'd need to describe each worker node with a separate sites.xml entry. 
>> Thats fine in principle, but a bit awkward, and may have scheduling issues
>> (ie if ssh hangs or dies when you dont own the node).
>>
>> Save ssh as another last resort; I suggest trying Condor first.
>>
>> If needed, people who used ssh recently can send you the info below.
>>
>> - Mike
>>
>>> (2) I am following the instructions on setting up ssh site provider to
>>> use nodes on the local network.
>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent
>>> running, and can ssh to the remote node without being asked for
>>> password?
>>> (2.2.) can anybody give me more detailed instructions on how to set
>>> up auth.defaults? I cannot make it work.
>>
>>> Thanks
>>>
>>> Andriy Fedorov
>>> _______________________________________________
>>> Swift-user mailing list
>>> Swift-user at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

From aespinosa at cs.uchicago.edu Fri Jun 12 15:04:09 2009
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Fri, 12 Jun 2009 15:04:09 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <4A32A9FE.4000600@mcs.anl.gov>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov>
Message-ID: <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>

Hi Andriy and Mike,

Here is my example ~/.ssh/auth.defaults for executing jobs on
tp-login1.ci.uchicago.edu:

[aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
tp-login1.ci.uchicago.edu.type=key
tp-login1.ci.uchicago.edu.username=aespinosa
tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX

We have used Falkon and coasters before for multi-core configurations.
But for a single multi-core machine, I believe you can get away with
having multiple entries of the same host in the sites.xml file using
the ssh provider.

i.e.:

  ...

  ...

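A minimal sketch of generating such auth.defaults entries for several hosts, assuming the four-key layout in the example above (type, username, key, passphrase) is all that a key-based login needs. The function name `auth_defaults_entry` is illustrative only and is not part of Swift or CoG.

```python
def auth_defaults_entry(host, username, key_path, passphrase):
    """Return the four auth.defaults lines for one key-authenticated host.

    The key names (type, username, key, passphrase) are copied from the
    example above; nothing else about the file format is assumed.
    """
    fields = [
        ("type", "key"),
        ("username", username),
        ("key", key_path),
        ("passphrase", passphrase),
    ]
    # Each line has the form <host>.<field>=<value>.
    return "\n".join(f"{host}.{name}={value}" for name, value in fields)


if __name__ == "__main__":
    # Values taken from the example above; substitute your own.
    print(auth_defaults_entry(
        "tp-login1.ci.uchicago.edu",
        "aespinosa",
        "/home/aespinosa/.ssh/id_dsa",
        "XXXXXXXX",
    ))
```

The output can be appended to ~/.ssh/auth.defaults, one block per host. Since the passphrase is stored in plain text, it is sensible to keep that file readable only by its owner.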
2009/6/12 Michael Wilde : > Ah, very cool. Im eager to get more user experience feedback on multicore > use. > > So I will try to hunt down my examples of .ssh configs. > > Also, Allan Espinosa used this recently. Allan, can you post details and > examples? > > Thanks! > > Mike > > On 6/12/09 2:06 PM, Andriy Fedorov wrote: >> >> Michael, >> >> Thank you for the advice, I will look into this. This is very helpful. >> I had an impression Lava is not included in the list of schedulers >> supported out of the box, but wanted to check. >> >> Just a clarification -- I need to access two different types of local >> resources. Cluster (via Lava or Condor) is one, but for the multicore >> nodes we have on the network, which are not part of cluster, the only >> option is to use ssh. >> >> AF >> >> >> >> On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: >>> >>> Andriy, >>> >>> Ben or Mihael may have better ideas, but I offer my thoughts below. >>> >>> On 6/12/09 1:18 PM, Andriy Fedorov wrote: >>>> >>>> Hi, >>>> >>>> I am trying to set up Swift with the local cluster and non-cluster >>>> resources in our lab. Here some configuration details. >>>> >>>> Due to technical problems, passphrase login is not possible for the >>>> nodes on local network, and I need to enter password each time. >>>> >>>> For the cluster, I was able to set up passphrase login for the head >>>> node. The cluster is running Lava and Condor schedulers at the same >>>> time, but Lava should be used if possible. >>>> >>>> Two questions: >>>> >>>> (1) is it possible to configure Swift to talk to Lava scheduler? >>> >>> Making Swift talk to a new scheduler means writing a new CoG provider (in >>> Java). You can likely use an existing "data" provider like "local"; you >>> could model the "execution" provider after the "PBS" provider. How hard >>> this >>> is depends on how close Lava is to PBS in nature. (I dont know it). And >>> the >>> provider interface you need to code to is not well documented afaik. 
>>>
>>> I would try the Condor provider. While that provider is less mature and
>>> tested than others, it should work, and if it doesnt, we should try to
>>> fix
>>> it.
>>>
>>> If possible, make sure a simple condor_submit hello-world works for you
>>> first.
>>>
>>> Run swift on the head/login node; use the "local" data provider.
>>>
>>> Another route is to use Falkon, but that will be harder and its less
>>> supported, so I suggest against this until easier routes are exhausted.
>>>
>>> I dont think that ssh will get you far, as to leverage the cluster I
>>> think
>>> you'd need to describe each worker node with a separate sites.xml entry.
>>> Thats fine in principle, but a bit awkward, and may have scheduling
>>> issues
>>> (ie if ssh hangs or dies when you dont own the node).
>>>
>>> Save ssh as another last resort; I suggest trying Condor first.
>>>
>>> If needed, people who used ssh recently can send you the info below.
>>>
>>> - Mike
>>>
>>>> (2) I am following the instructions on setting up ssh site provider to
>>>> use nodes on the local network.
>>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent
>>>> running, and can ssh to the remote node without being asked for
>>>> password?
>>>> (2.2.) can anybody give me more detailed instructions on how to set
>>>> up auth.defaults? I cannot make it work.
>>>
>>>> Thanks
>>>>
>>>> Andriy Fedorov
>>>> _______________________________________________

--
Allan M.
Espinosa PhD student, Computer Science University of Chicago From fedorov at cs.wm.edu Fri Jun 12 15:17:02 2009 From: fedorov at cs.wm.edu (Andriy Fedorov) Date: Fri, 12 Jun 2009 16:17:02 -0400 Subject: [Swift-user] Swift on local resources In-Reply-To: <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov> <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> Message-ID: <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com> Allan, Thank you for the example. I have exactly same setup, but it doesn't work for me. I suspect the reason is that I am unable to set up my environment to work with passphrase. I can only log in with password. I wonder if there is any workaround... AF On Fri, Jun 12, 2009 at 4:04 PM, Allan Espinosa wrote: > Hi Andriy and Mike, > > here is my example ~/.ssh/auth.defaults for executing jobs on > tp-login1.ci.uchicago.edu: > > [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults > tp-login1.ci.uchicago.edu.type=key > tp-login1.ci.uchicago.edu.username=aespinosa > tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa > tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX > > We have used falkon and coasters before for multi-core configurations. > but for a single multi-core machine, i believe you can get away with > having multiple entries of the same host in the sites.xml file using > the ssh-provider. > > ie: > > > > > ... > > > ... > > > > 2009/6/12 Michael Wilde : >> Ah, very cool. Im eager to get more user experience feedback on multicore >> use. >> >> So I will try to hunt down my examples of .ssh configs. >> >> Also, Allan Espinosa used this recently. Allan, can you post details and >> examples? >> >> Thanks! 
>> >> Mike >> >> On 6/12/09 2:06 PM, Andriy Fedorov wrote: >>> >>> Michael, >>> >>> Thank you for the advice, I will look into this. This is very helpful. >>> I had an impression Lava is not included in the list of schedulers >>> supported out of the box, but wanted to check. >>> >>> Just a clarification -- I need to access two different types of local >>> resources. Cluster (via Lava or Condor) is one, but for the multicore >>> nodes we have on the network, which are not part of cluster, the only >>> option is to use ssh. >>> >>> AF >>> >>> >>> >>> On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: >>>> >>>> Andriy, >>>> >>>> Ben or Mihael may have better ideas, but I offer my thoughts below. >>>> >>>> On 6/12/09 1:18 PM, Andriy Fedorov wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am trying to set up Swift with the local cluster and non-cluster >>>>> resources in our lab. Here some configuration details. >>>>> >>>>> Due to technical problems, passphrase login is not possible for the >>>>> nodes on local network, and I need to enter password each time. >>>>> >>>>> For the cluster, I was able to set up passphrase login for the head >>>>> node. The cluster is running Lava and Condor schedulers at the same >>>>> time, but Lava should be used if possible. >>>>> >>>>> Two questions: >>>>> >>>>> (1) is it possible to configure Swift to talk to Lava scheduler? >>>> >>>> Making Swift talk to a new scheduler means writing a new CoG provider (in >>>> Java). You can likely use an existing "data" provider like "local"; you >>>> could model the "execution" provider after the "PBS" provider. How hard >>>> this >>>> is depends on how close Lava is to PBS in nature. (I dont know it). And >>>> the >>>> provider interface you need to code to is not well documented afaik. >>>> >>>> I would try the Condor provider. While that provider is less mature and >>>> tested than others, it should work, and if it doesnt, we should try to >>>> fix >>>> it. 
>>>> >>>> If possible, make sure a simple condor_submit hello-world works for you >>>> first. >>>> >>>> Run swift on the head/login node; use the "local" data provider. >>>> >>>> Another route is to use Falkon, but that will be harder and its less >>>> supported, so I suggest against this until easier routes are exhausted. >>>> >>>> I dont think that ssh will get you far, as to leverage the cluster I >>>> think >>>> you'd need to describe each worker node with a separate sites.xml entry. >>>> Thats fine in principle, but a bit awkward, and may have scheduling >>>> issues >>>> (ie if ssh hangs or dies when you dont own the node). >>>> >>>> Save ssh as another last resort; I suggest trying Condor first. >>>> >>>> If needed, people who used ssh recently can send you the info below. >>>> >>>> - Mike >>>> >>>>> (2) I am following the instructions on setting up ssh site provider to >>>>> use nodes on the local network. >>>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent >>>>> running, and can ssh to the remote node without being asked for >>>>> password? >>>>> (2.2.) can anybody give me more detailed instructions on how to set >>>>> up auth.defaults? I cannot make it work. >>>> >>>>> Thanks >>>>> >>>>> Andriy Fedorov >>>>> _______________________________________________ > > > > -- > Allan M. 
Espinosa
> PhD student, Computer Science
> University of Chicago
>

From aespinosa at cs.uchicago.edu Fri Jun 12 15:41:41 2009
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Fri, 12 Jun 2009 15:41:41 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
 <4A32A3FD.8000603@mcs.anl.gov>
 <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>
 <4A32A9FE.4000600@mcs.anl.gov>
 <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
 <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
Message-ID: <50b07b4b0906121341x5bb20738q2b89ba71f521ee3f@mail.gmail.com>

Can you please post your swift logs?

when i remove my passphrase entry: i get the following error:

Swift svn swift-r2949 cog-r2406

RunID: remoterun
Progress:
Progress:  Initializing site shared directory:1
Progress:  Initializing site shared directory:1
Execution failed:
        Could not initialize shared directory on TERAPORT
Caused by:
        org.globus.cog.abstraction.impl.file.FileResourceException: Error while communicating with the SSH server on tp-login1.ci.uchicago.edu:22
Caused by:
        java.lang.NullPointerException
        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.loadDefaultCredentials(SSHChannelManager.java:160)
        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getDefaultCredentials(SSHChannelManager.java:120)
        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getCredentials(SSHChannelManager.java:79)
        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getChannel(SSHChannelManager.java:62)
        at org.globus.cog.abstraction.impl.ssh.file.FileResourceImpl.start(FileResourceImpl.java:81)
        at org.globus.cog.abstraction.impl.file.FileResourceCache.getResource(FileResourceCache.java:98)
        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.getResource(CachingDelegatedFileOperationHandler.java:75)
        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.submit(CachingDelegatedFileOperationHandler.java:40)
        at org.globus.cog.abstraction.impl.common.task.CachingFileOperationTaskHandler.submit(CachingFileOperationTaskHandler.java:28)
        at org.globus.cog.karajan.scheduler.submitQueue.NonBlockingSubmit.run(NonBlockingSubmit.java:86)
        at edu.emory.mathcs.backport.java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:431)
        at edu.emory.mathcs.backport.java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:643)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:668)
        at java.lang.Thread.run(Thread.java:595)

2009/6/12 Andriy Fedorov :
> Allan,
>
> Thank you for the example.
>
> I have exactly same setup, but it doesn't work for me. I suspect the
> reason is that I am unable to set up my environment to work with
> passphrase. I can only log in with password. I wonder if there is any
> workaround...
>
> AF
>
>
>
> On Fri, Jun 12, 2009 at 4:04 PM, Allan
> Espinosa wrote:
>> Hi Andriy and Mike,
>>
>> here is my example ~/.ssh/auth.defaults for executing jobs on
>> tp-login1.ci.uchicago.edu:
>>
>> [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
>> tp-login1.ci.uchicago.edu.type=key
>> tp-login1.ci.uchicago.edu.username=aespinosa
>> tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
>> tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX
>>
>> We have used falkon and coasters before for multi-core configurations.
>> but for a single multi-core machine, i believe you can get away with
>> having multiple entries of the same host in the sites.xml file using
>> the ssh-provider.
>>
>> ie:
>>
>>
>>
>>
>>   ...
>>
>>
>>   ...
>>
>>
>>
>> 2009/6/12 Michael Wilde :
>>> Ah, very cool. Im eager to get more user experience feedback on multicore
>>> use.
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Andriy Fedorov
>>>>>> _______________________________________________
>>
>

From fedorov at cs.wm.edu Fri Jun 12 15:46:15 2009
From: fedorov at cs.wm.edu (Andriy Fedorov)
Date: Fri, 12 Jun 2009 16:46:15 -0400
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
 <4A32A3FD.8000603@mcs.anl.gov>
 <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>
 <4A32A9FE.4000600@mcs.anl.gov>
 <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
 <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
Message-ID: <82f536810906121346n272781dbr5d2bef730a393efc@mail.gmail.com>

My ~/.ssh/auth.defaults is this:

george.bwh.harvard.edu.type=key
george.bwh.harvard.edu.username=fedorov
george.bwh.harvard.edu.key=/home/fedorov/.ssh/identity.pub
george.bwh.harvard.edu.passphrase=****

But what I am saying is that passphrase login is not working even when
I do plain ssh -- it is asking me to enter the password.

Here are the error messages I am getting trying to run a simple test:

Swift 0.9 swift-r2860 cog-r2388

RunID: 20090612-1643-tlksdhh5
Progress:
Progress:  Initializing site shared directory:1
Progress:  Initializing site shared directory:1
Progress:  Initializing site shared directory:1
Execution failed:
        Could not initialize shared directory on spl_george
Caused by:
        org.globus.cog.abstraction.impl.file.FileResourceException: Error while communicating with the SSH server on george.bwh.harvard.edu:22
Caused by:
        java.lang.NullPointerException
Caused by:
        java.lang.NullPointerException
        at java.lang.StringBuffer.<init>(StringBuffer.java:104)
        at com.sshtools.j2ssh.openssh.PEMReader.read(PEMReader.java:117)
        at com.sshtools.j2ssh.openssh.PEMReader.<init>(PEMReader.java:61)
        at com.sshtools.j2ssh.openssh.OpenSSHPrivateKeyFormat.isFormatted(OpenSSHPrivateKeyFormat.java:205)
        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:132)
        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:171)
        at org.globus.cog.abstraction.impl.ssh.Ssh.connect(Ssh.java:254)
        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle$Connection.ensureConnected(SSHConnectionBundle.java:234)
        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle.allocateChannel(SSHConnectionBundle.java:76)
        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getChannel(SSHChannelManager.java:71)
        at org.globus.cog.abstraction.impl.ssh.file.FileResourceImpl.start(FileResourceImpl.java:81)
        at org.globus.cog.abstraction.impl.file.FileResourceCache.getResource(FileResourceCache.java:98)
        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.getResource(CachingDelegatedFileOperationHandler.java:75)
        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.submit(CachingDelegatedFileOperationHandler.java:40)
        at org.globus.cog.abstraction.impl.common.task.CachingFileOperationTaskHandler.submit(CachingFileOperationTaskHandler.java:28)
        at org.globus.cog.karajan.scheduler.submitQueue.NonBlockingSubmit.run(NonBlockingSubmit.java:86)
        at edu.emory.mathcs.backport.java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:431)
        at edu.emory.mathcs.backport.java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:643)
        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:668)
        at java.lang.Thread.run(Thread.java:595)


On Fri, Jun 12, 2009 at 4:17 PM, Andriy Fedorov wrote:
> Allan,
>
> Thank you for the example.
>
> I have exactly same setup, but it doesn't work for me. I suspect the
> reason is that I am unable to set up my environment to work with
> passphrase. I can only log in with password. I wonder if there is any
> workaround...
>
> AF
>
>
>
> On Fri, Jun 12, 2009 at 4:04 PM, Allan
> Espinosa wrote:
>> Hi Andriy and Mike,
>>
>> here is my example ~/.ssh/auth.defaults for executing jobs on
>> tp-login1.ci.uchicago.edu:
>>
>> [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
>> tp-login1.ci.uchicago.edu.type=key
>> tp-login1.ci.uchicago.edu.username=aespinosa
>> tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
>> tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX
>>
>> We have used falkon and coasters before for multi-core configurations.
>> but for a single multi-core machine, i believe you can get away with
>> having multiple entries of the same host in the sites.xml file using
>> the ssh-provider.
>>
>> ie:
>>
>>
>>
>>
>>   ...
>>
>>
>>   ...
>>
>>
>>
>> 2009/6/12 Michael Wilde :
>>> Ah, very cool. Im eager to get more user experience feedback on multicore
>>> use.
>>>
>>> So I will try to hunt down my examples of .ssh configs.
>>>
>>> Also, Allan Espinosa used this recently.
>> --
>> Allan M.
Espinosa
>> PhD student, Computer Science
>> University of Chicago
>>
>

From benc at hawaga.org.uk Fri Jun 12 15:48:27 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 12 Jun 2009 20:48:27 +0000 (GMT)
Subject: [Swift-user] Swift on local resources
In-Reply-To: <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
 <4A32A3FD.8000603@mcs.anl.gov>
 <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>
 <4A32A9FE.4000600@mcs.anl.gov>
 <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
Message-ID:

On Fri, 12 Jun 2009, Allan Espinosa wrote:

> We have used falkon and coasters before for multi-core configurations.
> but for a single multi-core machine, i believe you can get away with
> having multiple entries of the same host in the sites.xml file using
> the ssh-provider.

A single site can have multiple jobs at once - you don't need one per
core. Look at any use of a cluster, for example.

Set the jobThrottle to reflect the number of simultaneous jobs you want
running (== the number of cores you want to use) - if you don't specify
it, the default is 20 jobs at once (jobThrottle=0.2), which is probably
not what you want.

The same applies for using the local execution provider to run on
multiple cores on your local machine.
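
To make the jobThrottle arithmetic concrete, here is a rough sketch in Python. It assumes the linear relationship implied by the figures above (jobThrottle=0.2 for about 20 jobs, i.e. roughly jobs / 100); the exact formula in a given Swift release may differ slightly, so check the user guide. The helper names and the profile-element form printed below are illustrative assumptions, not taken from this thread.

```python
# Sketch only: the "divide by 100" rule is inferred from the figures quoted
# in the message above (jobThrottle=0.2 <-> about 20 jobs). It is an
# assumption, not a statement of Swift's exact formula.


def job_throttle_for(concurrent_jobs: int) -> float:
    """Return a jobThrottle value for the desired number of simultaneous jobs."""
    if concurrent_jobs < 1:
        raise ValueError("need at least one concurrent job")
    return concurrent_jobs / 100.0


def profile_line(concurrent_jobs: int) -> str:
    """Format the value as a sites.xml profile entry (element form assumed)."""
    value = job_throttle_for(concurrent_jobs)
    return f'<profile namespace="karajan" key="jobThrottle">{value:g}</profile>'


if __name__ == "__main__":
    # Hypothetical core counts, chosen only for illustration.
    for cores in (4, 8, 20):
        print(cores, "cores ->", profile_line(cores))
```

With this rule, an 8-core machine reached through a single site entry would get a throttle of 0.08 rather than eight separate entries for the same host.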
--

From aespinosa at cs.uchicago.edu Fri Jun 12 15:53:52 2009
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Fri, 12 Jun 2009 15:53:52 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906121346n272781dbr5d2bef730a393efc@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
 <4A32A3FD.8000603@mcs.anl.gov>
 <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>
 <4A32A9FE.4000600@mcs.anl.gov>
 <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
 <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
 <82f536810906121346n272781dbr5d2bef730a393efc@mail.gmail.com>
Message-ID: <50b07b4b0906121353j5e000462ub9c1d4c83bcf8411@mail.gmail.com>

2009/6/12 Andriy Fedorov :
> My ~/.ssh/auth.defaults is this:
>
> george.bwh.harvard.edu.type=key
> george.bwh.harvard.edu.username=fedorov
> george.bwh.harvard.edu.key=/home/fedorov/.ssh/identity.pub

is identity.pub your public key? This entry should refer to your private key.

> george.bwh.harvard.edu.passphrase=****
>
> But what I am saying is that passphrase login is not working even when
> I do plain ssh -- it is asking me to enter the password.

I see. is your public key on the ~/.ssh/authorized_keys file of the remote host?

>
> Here are the error messages I am getting trying to run a simple test:
>
> Swift 0.9 swift-r2860 cog-r2388
>
> RunID: 20090612-1643-tlksdhh5
> Progress:
> Progress:  Initializing site shared directory:1
> Progress:  Initializing site shared directory:1
> Progress:  Initializing site shared directory:1
> Execution failed:
>        Could not initialize shared directory on spl_george
> Caused by:
>        org.globus.cog.abstraction.impl.file.FileResourceException:
> Error while communicating with the SSH server on
> george.bwh.harvard.edu:22
> Caused by:
>        java.lang.NullPointerException
> Caused by:
>        java.lang.NullPointerException
>        at java.lang.StringBuffer.<init>(StringBuffer.java:104)
>        at com.sshtools.j2ssh.openssh.PEMReader.read(PEMReader.java:117)
>        at com.sshtools.j2ssh.openssh.PEMReader.<init>(PEMReader.java:61)
>        at com.sshtools.j2ssh.openssh.OpenSSHPrivateKeyFormat.isFormatted(OpenSSHPrivateKeyFormat.java:205)
>        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:132)
>        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:171)
>        at org.globus.cog.abstraction.impl.ssh.Ssh.connect(Ssh.java:254)
>        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle$Connection.ensureConnected(SSHConnectionBundle.java:234)
>        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle.allocateChannel(SSHConnectionBundle.java:76)
>        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getChannel(SSHChannelManager.java:71)
>        at org.globus.cog.abstraction.impl.ssh.file.FileResourceImpl.start(FileResourceImpl.java:81)
>        at org.globus.cog.abstraction.impl.file.FileResourceCache.getResource(FileResourceCache.java:98)
>        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.getResource(CachingDelegatedFileOperationHandler.java:75)
>        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.submit(CachingDelegatedFileOperationHandler.java:40)
>        at org.globus.cog.abstraction.impl.common.task.CachingFileOperationTaskHandler.submit(CachingFileOperationTaskHandler.java:28)
>        at org.globus.cog.karajan.scheduler.submitQueue.NonBlockingSubmit.run(NonBlockingSubmit.java:86)
>        at edu.emory.mathcs.backport.java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:431)
>        at edu.emory.mathcs.backport.java.util.concurrent.FutureTask.run(FutureTask.java:166)
>        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:643)
>        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:668)
>        at java.lang.Thread.run(Thread.java:595)
>
>
> On Fri, Jun 12, 2009 at 4:17 PM, Andriy Fedorov wrote:
>> Allan,
>>
>> Thank you for the example.
>>
>> I have exactly same setup, but it doesn't work for me. I suspect the
>> reason is that I am unable to set up my environment to work with
>> passphrase. I can only log in with password. I wonder if there is any
>> workaround...
>>
>> AF
>>
>>
>>
>> On Fri, Jun 12, 2009 at 4:04 PM, Allan
>> Espinosa wrote:
>>> Hi Andriy and Mike,
>>>
>>> here is my example ~/.ssh/auth.defaults for executing jobs on
>>> tp-login1.ci.uchicago.edu:
>>>
>>> [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
>>> tp-login1.ci.uchicago.edu.type=key
>>> tp-login1.ci.uchicago.edu.username=aespinosa
>>> tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
>>> tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX
>>>
>>> We have used falkon and coasters before for multi-core configurations.
>>> but for a single multi-core machine, i believe you can get away with
>>> having multiple entries of the same host in the sites.xml file using
>>> the ssh-provider.
>>>
>>> ie:
>>>
>>>
>>>
>>>
>>>   ...
>>>
>>>
>>>   ...
>>>
>>>
>>>
>>> 2009/6/12 Michael Wilde :
>>>> Ah, very cool. Im eager to get more user experience feedback on multicore
>>>> use.
>>>>
>>>> So I will try to hunt down my examples of .ssh configs.
>>>>
>>>> Also, Allan Espinosa used this recently. Allan, can you post details and
>>>> examples?
>>>>
>>>> Thanks!
>>>>
>>>> Mike
>>>>
>>>> On 6/12/09 2:06 PM, Andriy Fedorov wrote:
>>>>>
>>>>> Michael,
>>>>>
>>>>> Thank you for the advice, I will look into this. This is very helpful.
>>>>> I had an impression Lava is not included in the list of schedulers
>>>>> supported out of the box, but wanted to check.
>>>>>
>>>>> Just a clarification -- I need to access two different types of local
>>>>> resources.
--
Allan M.
Espinosa
PhD student, Computer Science
University of Chicago

From fedorov at cs.wm.edu Fri Jun 12 16:16:40 2009
From: fedorov at cs.wm.edu (Andriy Fedorov)
Date: Fri, 12 Jun 2009 17:16:40 -0400
Subject: [Swift-user] Swift on local resources
In-Reply-To: <50b07b4b0906121353j5e000462ub9c1d4c83bcf8411@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com>
 <4A32A3FD.8000603@mcs.anl.gov>
 <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com>
 <4A32A9FE.4000600@mcs.anl.gov>
 <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com>
 <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
 <82f536810906121346n272781dbr5d2bef730a393efc@mail.gmail.com>
 <50b07b4b0906121353j5e000462ub9c1d4c83bcf8411@mail.gmail.com>
Message-ID: <82f536810906121416o66e56a85xe981d3b7536153f7@mail.gmail.com>

On Fri, Jun 12, 2009 at 4:53 PM, Allan Espinosa wrote:
> 2009/6/12 Andriy Fedorov :
>> My ~/.ssh/auth.defaults is this:
>>
>> george.bwh.harvard.edu.type=key
>> george.bwh.harvard.edu.username=fedorov
>> george.bwh.harvard.edu.key=/home/fedorov/.ssh/identity.pub
>
> is identity.pub your public key? This entry should refer to your private key.
>

Ah, ok ... Now I have this error, which makes more sense:

Swift 0.9 swift-r2860 cog-r2388

RunID: 20090612-1714-8od3dmb0
Progress:
Progress:  Initializing site shared directory:1
Progress:  Initializing site shared directory:1
Progress:  Initializing site shared directory:1
Execution failed:
        Could not initialize shared directory on spl_george
Caused by:
        org.globus.cog.abstraction.impl.file.FileResourceException: Error while communicating with the SSH server on george.bwh.harvard.edu:22
Caused by:
        Public Key Authentication failed

>> george.bwh.harvard.edu.passphrase=****
>>
>> But what I am saying is that passphrase login is not working even when
>> I do plain ssh -- it is asking me to enter the password.
>
> I see. is your public key on the ~/.ssh/authorized_keys file of the remote host?
>>

Yes, I think I followed the instructions precisely. Something is wrong
with the system, because I was able to set up passphrase access to the
cluster head node, but not between the nodes that share ~/.ssh (/home
is NFS-mounted). Other people in the lab had same difficulties with ssh
keys, so I am afraid it's not something obvious.

>> Here are the error messages I am getting trying to run a simple test:
>>
>> Swift 0.9 swift-r2860 cog-r2388
>>
>> RunID: 20090612-1643-tlksdhh5
>> Progress:
>> Progress:  Initializing site shared directory:1
>> Progress:  Initializing site shared directory:1
>> Progress:  Initializing site shared directory:1
>> Execution failed:
>>        Could not initialize shared directory on spl_george
>> Caused by:
>>        org.globus.cog.abstraction.impl.file.FileResourceException:
>> Error while communicating with the SSH server on
>> george.bwh.harvard.edu:22
>> Caused by:
>>        java.lang.NullPointerException
>> Caused by:
>>        java.lang.NullPointerException
>>        at java.lang.StringBuffer.<init>(StringBuffer.java:104)
>>        at com.sshtools.j2ssh.openssh.PEMReader.read(PEMReader.java:117)
>>        at com.sshtools.j2ssh.openssh.PEMReader.<init>(PEMReader.java:61)
>>        at com.sshtools.j2ssh.openssh.OpenSSHPrivateKeyFormat.isFormatted(OpenSSHPrivateKeyFormat.java:205)
>>        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:132)
>>        at com.sshtools.j2ssh.transport.publickey.SshPrivateKeyFile.parse(SshPrivateKeyFile.java:171)
>>        at org.globus.cog.abstraction.impl.ssh.Ssh.connect(Ssh.java:254)
>>        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle$Connection.ensureConnected(SSHConnectionBundle.java:234)
>>        at org.globus.cog.abstraction.impl.ssh.SSHConnectionBundle.allocateChannel(SSHConnectionBundle.java:76)
>>        at org.globus.cog.abstraction.impl.ssh.SSHChannelManager.getChannel(SSHChannelManager.java:71)
>>        at org.globus.cog.abstraction.impl.ssh.file.FileResourceImpl.start(FileResourceImpl.java:81)
>>        at org.globus.cog.abstraction.impl.file.FileResourceCache.getResource(FileResourceCache.java:98)
>>        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.getResource(CachingDelegatedFileOperationHandler.java:75)
>>        at org.globus.cog.abstraction.impl.file.CachingDelegatedFileOperationHandler.submit(CachingDelegatedFileOperationHandler.java:40)
>>        at org.globus.cog.abstraction.impl.common.task.CachingFileOperationTaskHandler.submit(CachingFileOperationTaskHandler.java:28)
>>        at org.globus.cog.karajan.scheduler.submitQueue.NonBlockingSubmit.run(NonBlockingSubmit.java:86)
>>        at edu.emory.mathcs.backport.java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:431)
>>        at edu.emory.mathcs.backport.java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:643)
>>        at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:668)
>>        at java.lang.Thread.run(Thread.java:595)
>>
>>
>> On Fri, Jun 12, 2009 at 4:17 PM, Andriy Fedorov wrote:
>>> Allan,
>>>
>>> Thank you for the example.
>>>
>>> I have exactly same setup, but it doesn't work for me. I suspect the
>>> reason is that I am unable to set up my environment to work with
>>> passphrase. I can only log in with password. I wonder if there is any
>>> workaround...
>>>
>>> AF
>>>
>>>
>>>
>>> On Fri, Jun 12, 2009 at 4:04 PM, Allan
>>> Espinosa wrote:
>>>> Hi Andriy and Mike,
>>>>
>>>> here is my example ~/.ssh/auth.defaults for executing jobs on
>>>> tp-login1.ci.uchicago.edu:
>>>>
>>>> [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
>>>> tp-login1.ci.uchicago.edu.type=key
>>>> tp-login1.ci.uchicago.edu.username=aespinosa
>>>> tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
>>>> tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX
>>>>
>>>> We have used falkon and coasters before for multi-core configurations.
>>>> but for a single multi-core machine, i believe you can get away with >>>> having multiple entries of the same host in the sites.xml file using >>>> the ssh-provider. >>>> >>>> ie: >>>> >>>> >>>> >>>> >>>> ... >>>> >>>> >>>> ... >>>> >>>> >>>> >>>> 2009/6/12 Michael Wilde : >>>>> Ah, very cool. Im eager to get more user experience feedback on multicore >>>>> use. >>>>> >>>>> So I will try to hunt down my examples of .ssh configs. >>>>> >>>>> Also, Allan Espinosa used this recently. Allan, can you post details and >>>>> examples? >>>>> >>>>> Thanks! >>>>> >>>>> Mike >>>>> >>>>> On 6/12/09 2:06 PM, Andriy Fedorov wrote: >>>>>> >>>>>> Michael, >>>>>> >>>>>> Thank you for the advice, I will look into this. This is very helpful. >>>>>> I had an impression Lava is not included in the list of schedulers >>>>>> supported out of the box, but wanted to check. >>>>>> >>>>>> Just a clarification -- I need to access two different types of local >>>>>> resources. Cluster (via Lava or Condor) is one, but for the multicore >>>>>> nodes we have on the network, which are not part of cluster, the only >>>>>> option is to use ssh. >>>>>> >>>>>> AF >>>>>> >>>>>> >>>>>> >>>>>> On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: >>>>>>> >>>>>>> Andriy, >>>>>>> >>>>>>> Ben or Mihael may have better ideas, but I offer my thoughts below. >>>>>>> >>>>>>> On 6/12/09 1:18 PM, Andriy Fedorov wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I am trying to set up Swift with the local cluster and non-cluster >>>>>>>> resources in our lab. Here some configuration details. >>>>>>>> >>>>>>>> Due to technical problems, passphrase login is not possible for the >>>>>>>> nodes on local network, and I need to enter password each time. >>>>>>>> >>>>>>>> For the cluster, I was able to set up passphrase login for the head >>>>>>>> node. The cluster is running Lava and Condor schedulers at the same >>>>>>>> time, but Lava should be used if possible. 
>>>>>>>> >>>>>>>> Two questions: >>>>>>>> >>>>>>>> (1) is it possible to configure Swift to talk to Lava scheduler? >>>>>>> >>>>>>> Making Swift talk to a new scheduler means writing a new CoG provider (in >>>>>>> Java). You can likely use an existing "data" provider like "local"; you >>>>>>> could model the "execution" provider after the "PBS" provider. How hard >>>>>>> this >>>>>>> is depends on how close Lava is to PBS in nature. (I dont know it). And >>>>>>> the >>>>>>> provider interface you need to code to is not well documented afaik. >>>>>>> >>>>>>> I would try the Condor provider. While that provider is less mature and >>>>>>> tested than others, it should work, and if it doesnt, we should try to >>>>>>> fix >>>>>>> it. >>>>>>> >>>>>>> If possible, make sure a simple condor_submit hello-world works for you >>>>>>> first. >>>>>>> >>>>>>> Run swift on the head/login node; use the "local" data provider. >>>>>>> >>>>>>> Another route is to use Falkon, but that will be harder and its less >>>>>>> supported, so I suggest against this until easier routes are exhausted. >>>>>>> >>>>>>> I dont think that ssh will get you far, as to leverage the cluster I >>>>>>> think >>>>>>> you'd need to describe each worker node with a separate sites.xml entry. >>>>>>> Thats fine in principle, but a bit awkward, and may have scheduling >>>>>>> issues >>>>>>> (ie if ssh hangs or dies when you dont own the node). >>>>>>> >>>>>>> Save ssh as another last resort; I suggest trying Condor first. >>>>>>> >>>>>>> If needed, people who used ssh recently can send you the info below. >>>>>>> >>>>>>> - Mike >>>>>>> >>>>>>>> (2) I am following the instructions on setting up ssh site provider to >>>>>>>> use nodes on the local network. >>>>>>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent >>>>>>>> running, and can ssh to the remote node without being asked for >>>>>>>> password? >>>>>>>> (2.2.) 
can anybody give me more detailed instructions on how to set >>>>>>>> up auth.defaults? I cannot make it work. >>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> Andriy Fedorov >>>>>>>> _______________________________________________ >>>> >>>> >>>> >>>> -- >>>> Allan M. Espinosa >>>> PhD student, Computer Science >>>> University of Chicago >>>> >>> >> >> > > > > -- > Allan M. Espinosa > PhD student, Computer Science > University of Chicago > From HodgessE at uhd.edu Fri Jun 12 16:58:29 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Fri, 12 Jun 2009 16:58:29 -0500 Subject: [Swift-user] puzzled on error code Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE4@BALI.uhd.campus> Here is some output with a definitely non-help error message: [erin at tp-login2 bin]$ swift -tc.file tc.data perm1.swift Swift svn swift-r2950 cog-r2406 RunID: 20090612-1653-55eg5dg9 Progress: Progress: Checking status:1 Progress: Stage in:1 Progress: Failed but can retry:1 Execution failed: Exception in RPermInvoke: Arguments: [permutations.R, 4, 1] Host: localhost Directory: perm1-20090612-1653-55eg5dg9/jobs/c/RPermInvoke-czu3f6cj stderr.txt: stdout.txt: ---- Caused by: Exit code 2 [erin at tp-login2 bin]$ cat perm1.swift # for running the script permutations.R # swift will give R the permutation number # and the matrices for running the t-test # and will produce a file results/.out type file{} (file rout) perm_r (file scriptname, int pnum, int pstart) { app { RPermInvoke @filename(scriptname) pnum pstart ; } } file r_script; #file perm_matrix; foreach i in [1:1] { file r_out ; (r_out) = perm_r(r_script,4,i); } [erin at tp-login2 bin]$ cat tc.data #NOTE WELL: fields in this file must be separated by tabs, not spaces # and there must be no trailing whitespace at the end of each line. 
# # sitename app pathname (ignored) (ignored) profiles localhost echo /bin/echo INSTALLED INTEL32::LINUX null teraport echo /bin/echo INSTALLED INTEL32::LINUX null localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null localhost RPermInvoke /home/erin/R-2.9.0/bin/RPermInvoke.sh INSTALLED INTEL32::LINUX null Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From HodgessE at uhd.edu Fri Jun 12 17:18:10 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Fri, 12 Jun 2009 17:18:10 -0500 Subject: [Swift-user] upgraded to error one Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> I found some problems with RPermInvoke.sh, but still am having problems as you can see: [erin at tp-login2 bin]$ swift -tc.file tc.data perm1.swift Swift svn swift-r2950 cog-r2406 RunID: 20090612-1714-qye382u3 Progress: Progress: Checking status:1 Progress: Checking status:1 Progress: Checking status:1 Execution failed: Exception in RPermInvoke: Arguments: [permutations.R, 4, 1] Host: localhost Directory: perm1-20090612-1714-qye382u3/jobs/7/RPermInvoke-7q5yf6cj stderr.txt: stdout.txt: ---- Caused by: Exit code 1 [erin at tp-login2 bin]$ cat tc.data #NOTE WELL: fields in this file must be separated by tabs, not spaces # and there must be no trailing whitespace at the end of each line. 
# # sitename app pathname (ignored) (ignored) profiles localhost echo /bin/echo INSTALLED INTEL32::LINUX null teraport echo /bin/echo INSTALLED INTEL32::LINUX null localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null localhost RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null teraport RInvoke /home/erin/R-2.9.0/bin/RInvoke.sh INSTALLED INTEL32::LINUX null localhost RPermInvoke /home/erin/R-2.9.0/bin/RPermInvoke.sh INSTALLED INTEL32::LINUX null [erin at tp-login2 bin]$ cat perm1.swift # for running the script permutations.R # swift will give R the permutation number # and the matrices for running the t-test # and will produce a file results/.out type file{} (file rout) perm_r (file scriptname, int pnum, int pstart) { app { RPermInvoke @filename(scriptname) pnum pstart ; } } file r_script; #file perm_matrix; foreach i in [1:1] { file r_out ; (r_out) = perm_r(r_script,4,i); } [erin at tp-login2 bin]$ cat RPermInvoke.sh #!/bin/bash export R_SCRIPT=$1 shift export R_SWIFT_ARGS="$*" /home/erin/R-2.9.0/bin/R CMD BATCH --vanilla $R_SCRIPT [erin at tp-login2 bin]$ Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From benc at hawaga.org.uk Fri Jun 12 17:54:18 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Fri, 12 Jun 2009 22:54:18 +0000 (GMT) Subject: [Swift-user] upgraded to error one In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> Message-ID: if you run RPermInvoke.sh with appropriate commandline parameters and then on the immediately following command line, type: echo $? 
what number do you get?

On Fri, 12 Jun 2009, Hodgess, Erin wrote:

> I found some problems with RPermInvoke.sh, but still am having problems as you can see:

From HodgessE at uhd.edu Sat Jun 13 02:08:14 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Sat, 13 Jun 2009 02:08:14 -0500
Subject: [Swift-user] upgraded to error one
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FED@BALI.uhd.campus>

It never finishes.

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

-----Original Message-----
From: Ben Clifford [mailto:benc at hawaga.org.uk]
Sent: Fri 6/12/2009 5:54 PM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] upgraded to error one

if you run RPermInvoke.sh with appropriate commandline parameters and then
on the immediately following command line, type:

echo $?

what number do you get?
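The check Ben describes (run the wrapper by hand, then look at `$?`) can be sketched in bash roughly as follows. This is only an illustration: `run_and_report` is a hypothetical helper name, not part of Swift, and the `sh -c 'exit 2'` command is a stand-in for the real wrapper script.

```shell
#!/bin/bash
# run_and_report is a hypothetical helper (not part of Swift): it runs the
# given command line and prints the exit status that the shell observed,
# which is the same number Swift reports as "Exit code N".
run_and_report() {
    local status=0
    "$@" || status=$?    # capture the status without aborting under `set -e`
    echo "exit status: $status"
    return "$status"
}

# Stand-in command that exits with status 2, in place of the real wrapper.
run_and_report sh -c 'exit 2' || true
```

To reproduce the failing job by hand, the stand-in could be replaced with the command line from the thread, for example `run_and_report /home/erin/R-2.9.0/bin/RPermInvoke.sh permutations.R 4 1`, run from the directory that holds permutations.R. The empty stdout.txt and stderr.txt in the Swift report may be explained by `R CMD BATCH`, which by default writes the R session output to a `.Rout` file named after the script rather than to the terminal, so that file is likely the place to look for the actual R error.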
From benc at hawaga.org.uk Sat Jun 13 03:56:00 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sat, 13 Jun 2009 08:56:00 +0000 (GMT)
Subject: [Swift-user] upgraded to error one
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FED@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C36FED@BALI.uhd.campus>
Message-ID:

On Sat, 13 Jun 2009, Hodgess, Erin wrote:

> It never finishes.

Can you send the entire commandline you used?

Based on what I see from your swiftscript it should be something like

RPermInvoke permutations.R 4 1

--

From hategan at mcs.anl.gov Sat Jun 13 06:12:42 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 13 Jun 2009 06:12:42 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov> <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com>
Message-ID: <1244891562.10588.6.camel@localhost>

On Fri, 2009-06-12 at 16:17 -0400, Andriy Fedorov wrote:
> Allan,
>
> Thank you for the example.
>
> I have exactly same setup, but it doesn't work for me. I suspect the
> reason is that I am unable to set up my environment to work with
> passphrase. I can only log in with password. I wonder if there is any
> workaround...
If you want to use password authentication, remove the key and
passphrase lines and add .password=xxx.

However, in some cases being asked by ssh for a password may mean
"keyboard interactive authentication" which is not the same as
"username/password authentication" and may not work with the ssh
provider. Do an ssh -v and you'll see the list of authentication methods
that the server allows.

>
> AF
>
> On Fri, Jun 12, 2009 at 4:04 PM, Allan Espinosa wrote:
> > here is my example ~/.ssh/auth.defaults for executing jobs on
> > tp-login1.ci.uchicago.edu:
> >
> > [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults
> > tp-login1.ci.uchicago.edu.type=key
> > tp-login1.ci.uchicago.edu.username=aespinosa
> > tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa
> > tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

From wilde at mcs.anl.gov Sat Jun 13 14:58:27 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 13 Jun 2009 14:58:27 -0500
Subject: [Swift-user] upgraded to error one
In-Reply-To:
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C36FED@BALI.uhd.campus>
Message-ID: <4A3404E3.2000204@mcs.anl.gov>

Erin, if the script never finishes, then thats something you and I
should work on outside of the swift-user list, as the problem is in your
R program, not in Swift.
(Even if it came from the Swift apps CVS tree)

But when reporting things like this, additional info that is valuable
(as well as a few things to try) is:

- how long did you let it run? Minutes? longer?

- did you look at the process using ps while it was running? Was it
consuming CPU time, or "frozen", waiting for something?

- what dataset was it reading? Is that dataset located through some
other means? Is the script reading from a file or a database? If a
database, can you get to it outside of R, eg with mysql?

How long is the dataset? Did you try the program on a tiny fragment of
the data?

I did not look at the script - presumably the answers are in the source
code.

But unless Im missing something from earlier in this thread, this is a
topic we should work on off of the swift-user list. (and Im happy to
help with it)

- Mike

On 6/13/09 3:56 AM, Ben Clifford wrote:
> On Sat, 13 Jun 2009, Hodgess, Erin wrote:
>
>> It never finishes.
>
> Can you send the entire commandline you used?
>
> based on what I see from your swiftscript it should be something like
>
> RPermInvoke permutations.R 4 1
>

From hategan at mcs.anl.gov Sat Jun 13 16:39:29 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 13 Jun 2009 16:39:29 -0500
Subject: [Swift-user] upgraded to error one
In-Reply-To: <4A3404E3.2000204@mcs.anl.gov>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C36FE6@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C36FED@BALI.uhd.campus> <4A3404E3.2000204@mcs.anl.gov>
Message-ID: <1244929169.5558.1.camel@localhost>

This is also one case where swift/coasters should say "walltime
exceeded" instead of "Exit code n".

On Sat, 2009-06-13 at 14:58 -0500, Michael Wilde wrote:
> Erin, if the script never finishes, then thats something you and I
> should work on outside of the swift-user list, as the problem is in your
> R program, not in Swift. (Even if it came from the Swift apps CVS tree)

From HodgessE at uhd.edu Mon Jun 15 08:46:35 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Mon, 15 Jun 2009 08:46:35 -0500
Subject: [Swift-user] multiple files for each app call
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus>

Hi!

Suppose I have an application that I call which produces, say, 5 files per call.

I then want to put that in a foreach loop for 1:10. My goal is to produce 50 files.

The trouble seems to be producing the multiple files in a single call.

Would this be appropriate for an array of files, please?

thanks,
Erin

Erin M.
Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From benc at hawaga.org.uk Mon Jun 15 09:01:24 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 15 Jun 2009 14:01:24 +0000 (GMT)
Subject: [Swift-user] multiple files for each app call
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus>
Message-ID:

On Mon, 15 Jun 2009, Hodgess, Erin wrote:

> Would this be appropriate for an array of files, please?

You can return an array or a structure; or you can declare multiple
return parameters for a procedure.

An array makes sense if the files are named numerically, for example
a0001.txt a0002.txt, ...

To return multiple files, you can declare multiple return values from a
procedure like this:

(file a, file b) myproc() {
  ...
}

--

From HodgessE at uhd.edu Mon Jun 15 09:06:55 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Mon, 15 Jun 2009 09:06:55 -0500
Subject: [Swift-user] multiple files for each app call
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37002@BALI.uhd.campus>

Ok.

thanks,
Erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From HodgessE at uhd.edu Mon Jun 15 11:45:46 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Mon, 15 Jun 2009 11:45:46 -0500
Subject: [Swift-user] still having problems
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>

Hi again!

I have brought things down to the bones here. Still no luck.

The files are /home/erin/R-2.9.0/bin/perm.short.R
and /home/erin/R-2.9.0/bin/perm4.swift

[erin at tp-login2 bin]$ swift -tc.file tc.data perm4.swift
Swift svn swift-r2950 cog-r2406

RunID: 20090615-1137-nbfujhb2
Progress:
Progress: Checking status:1
Progress: Stage in:1
Execution failed:
Exception in RInvoke:
Arguments: [perm.short.R, 0]
Host: localhost
Directory: perm4-20090615-1137-nbfujhb2/jobs/v/RInvoke-vti60bcj
stderr.txt:

stdout.txt:

----

Caused by:
Exit code 1

[erin at tp-login2 bin]$ cat perm4.swift
type file{}

(file procOut) permScript (file script, int batchSize){
    app{
        RInvoke @filename(script) batchSize ;
    }
}

int batchSize=0;
file script<"perm.short.R">;
file procOut<"a.txt">;
procOut=permScript(script, batchSize);

[erin at tp-login2 bin]$ cat perm.short.R
# -- read data files that are hardcoded per analysis and get
vox_speech_vec <- as.matrix(read.table("origccf.txt"))
pm <- as.matrix(read.table("corr.perm.matrix.txt"))
allinputs <- Sys.getenv("R_SWIFT_ARGS")
print(allinputs)
permlength <- as.numeric(noquote(strsplit(allinputs," ")[[1]][1]))
# startrpermrow <- as.numeric(noquote(strsplit(allinputs," ")[[1]][1]))
startrpermrow <- 1
endpermrow <- permlength + startrpermrow
debug = 1
if (debug == 1) {
    print(paste("start", startrpermrow, "end", endpermrow, sep=","))
}
# rotate across specified rows in permutation matrix
for (rr in startrpermrow:endpermrow){
    permvec = pm[rr,]
    # initialize a 'ccf' matrix for each permutation
    ccf <- matrix(nrow=200, ncol=1)
    mat_row = 0
    for (vox in 1:200) {
        mat_row = mat_row + 1
        speechperm_ccf <- ccf(vox_speech_vec[vox,], permvec, lag.max = 6, type = c("correlation"), na.action=na.pass, plot = FALSE)
        if (any(speechperm_ccf$acf[1:13] == "NaN")) {
            # speechperm_ccf$acf[1:13] = 0
            speechperm_ccf$acf[1:13] <- rep(0,13)
        } else {
            speechperm_ccf$acf[1:13] = speechperm_ccf$acf[1:13]
        }
        speechperm_ccf <- as.matrix(data.frame(speechperm_ccf$acf, speechperm_ccf$lag))[7:13,]
        speechperm_cor <- speechperm_ccf[which.max(speechperm_ccf[,1]),]
        ccf[mat_row, ] <- c(speechperm_cor[[1]])
    }
    write.table (ccf, file=paste("a.txt",sep=""), row.names=FALSE, col.names=FALSE)
    if (debug == 1) {
        print(paste("permrow finished", rr, sep=","))
        print(date())
    }
    rm(ccf)
}
[erin at tp-login2 bin]$

Any help appreciated.

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

From me.melly at gmail.com Mon Jun 15 11:41:19 2009
From: me.melly at gmail.com (Melinda Chin)
Date: Mon, 15 Jun 2009 11:41:19 -0500
Subject: [Swift-user] multiple files for each app call
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37002@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C37002@BALI.uhd.campus>
Message-ID: <63cc32bc0906150941n2afe574dud938d577d3c2852c@mail.gmail.com>

I was trying to do the t.swift tutorial again and I keep getting stuck on
this error. What does it mean to not be able to find a valid host? How do
I remedy this? And another question on the side: what's the difference
between swift-user at ci.uchicago.edu and swift-user-owner at ci.uchicago.edu?
For help, can we send to both, or usually just swift-user?
*THIS IS WHAT t.swift LOOKS LIKE:* =============================================================== [mchin at tp-login2 swift]$ cat t.swift type messagefile; (messagefile t) greeting (string s) { app{ echo s stdout = @filename(t); } } (messagefile t) capitalise (messagefile f) { app { translate "[a-z]" "[A-Z]" stdin = @filename(f) stdout = @filename(t); } } messagefile outfile <"greeting.txt">; messagefile cfile <"capitalised.txt">; outfile = greeting("hello from Swift"); cfile = capitalise(outfile); =============================================================== *THIS IS WHAT tc.data LOOKS LIKE:* =============================================================== [mchin at tp-login2 swift]$ cat tc.data #This is the transformation catalog. # #It comes pre-configured with a number of simple transformations with #paths that are likely to work on a linux box. However, on some systems, #the paths to these executables will be different (for example, sometimes #some of these programs are found in /usr/bin rather than in /bin) # #NOTE WELL: fields in this file must be separated by tabs, not spaces; and #there must be no trailing whitespace at the end of each line. 
# # sitename transformation path INSTALLED platform profiles localhost echo /bin/echo INSTALLED INTEL32::LINUX null localhost cat /bin/cat INSTALLED INTEL32::LINUX null localhost ls /bin/ls INSTALLED INTEL32::LINUX null localhost grep /bin/grep INSTALLED INTEL32::LINUX null localhost sort /bin/sort INSTALLED INTEL32::LINUX null localhost paste /bin/paste INSTALLED INTEL32::LINUX null localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null =============================================================== *THIS IS THE ERROR MESSAGE:* =============================================================== [mchin at tp-login2 swift]$ swift t.swift Swift svn swift-r2950 cog-r2406 RunID: 20090615-1125-6z1ehvf8 Progress: Progress: Initializing site shared directory:1 Execution failed: Could not find any valid host for task "Task(type=UNKNOWN, identity=urn:cog-1245083114852)" with constraints {filenames=[Ljava.lang.String;@1f0a76e, trfqn=translate, filecache=org.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 1f0a797, tr=translate} =============================================================== Thank you for your help, Melinda Chin -------------- next part -------------- An HTML attachment was scrubbed... URL: From me.melly at gmail.com Mon Jun 15 11:43:01 2009 From: me.melly at gmail.com (Melinda Chin) Date: Mon, 15 Jun 2009 11:43:01 -0500 Subject: [Swift-user] Questions and ERRORs Message-ID: <63cc32bc0906150943q42790c10s5ea98cf71319ada2@mail.gmail.com> On Mon, Jun 15, 2009 at 11:41 AM, Melinda Chin wrote: > I was trying to do the t.swift tutorial again and I keep getting stuck on > this error. What does it mean to not be able to find a valid host? How do > I remedy this? And another question on the side what's the difference > between swift-user at ci.uchicago.edu and swift-user-owner at ci.uchicago.edu. > For help we can send to both? or usually just swift-user? 
> > Thank you for your help, > Melinda Chin > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From fedorov at cs.wm.edu Mon Jun 15 11:49:32 2009 From: fedorov at cs.wm.edu (Andriy Fedorov) Date: Mon, 15 Jun 2009 12:49:32 -0400 Subject: [Swift-user] Swift on local resources In-Reply-To: <1244891562.10588.6.camel@localhost> References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov> <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com> <1244891562.10588.6.camel@localhost> Message-ID: <82f536810906150949t1fbbaa01q8fccd1017245950b@mail.gmail.com> On Sat, Jun 13, 2009 at 7:12 AM, Mihael Hategan wrote: > If you want to use password authentication, remove the key and > passphrase lines and add .password=xxx. > ...and add host.site.type=password What you suggested is actually working for me, thank you for the clarification. But having password saved in plain text is not a very comfortable option. Please consider this a feature request: it would be great if for the SSH authentication, the user was asked for a password once, and then the password (or its hash) was stored securely, and authentication was done transparently. I am even willing to enter my password each time I submit a script, rather than having it saved as text on a filesystem! But I am not given an option to do it like this at this point. > However, in some cases being asked by ssh for a password may mean > "keyboard interactive authentication" which is not the same as > "username/password authentication" and may not work with the ssh > provider. Do an ssh -v and you'll see the list of authentication methods > that the server allows. 
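The `ssh -v` check suggested in the quoted advice can be scripted. This is a minimal sketch, assuming an OpenSSH client, whose verbose output contains lines beginning with `debug1: Authentications that can continue:`. The function name `auth_methods` and the `user@host` in the usage comment are placeholders, not part of Swift or of the ssh provider.

```shell
# auth_methods: read `ssh -v` output on stdin and print each authentication
# method the server offered, one per line, without duplicates.
# Assumes OpenSSH's "debug1: Authentications that can continue: a,b,c" line.
auth_methods() {
    tr -d '\r' |
        sed -n 's/^debug1: Authentications that can continue: //p' |
        tr ',' '\n' |
        sort -u
}

# Usage (placeholder host; BatchMode=yes keeps ssh from prompting):
#   ssh -v -o BatchMode=yes user@host true 2>&1 | auth_methods
```

If the list shows `keyboard-interactive` but not `password`, then, going by the advice quoted above, username/password authentication through the ssh provider may not work for that server.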
> >> >> AF >> >> >> >> On Fri, Jun 12, 2009 at 4:04 PM, Allan >> Espinosa wrote: >> > Hi Andriy and Mike, >> > >> > here is my example ~/.ssh/auth.defaults for executing jobs on >> > tp-login1.ci.uchicago.edu: >> > >> > [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults >> > tp-login1.ci.uchicago.edu.type=key >> > tp-login1.ci.uchicago.edu.username=aespinosa >> > tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa >> > tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX >> > >> > We have used falkon and coasters before for multi-core configurations. >> > but for a single multi-core machine, i believe you can get away with >> > having multiple entries of the same host in the sites.xml file using >> > the ssh-provider. >> > >> > ie: >> > >> > >> > >> > >> > ... >> > >> > >> > ... >> > >> > >> > >> > 2009/6/12 Michael Wilde : >> >> Ah, very cool. Im eager to get more user experience feedback on multicore >> >> use. >> >> >> >> So I will try to hunt down my examples of .ssh configs. >> >> >> >> Also, Allan Espinosa used this recently. Allan, can you post details and >> >> examples? >> >> >> >> Thanks! >> >> >> >> Mike >> >> >> >> On 6/12/09 2:06 PM, Andriy Fedorov wrote: >> >>> >> >>> Michael, >> >>> >> >>> Thank you for the advice, I will look into this. This is very helpful. >> >>> I had an impression Lava is not included in the list of schedulers >> >>> supported out of the box, but wanted to check. >> >>> >> >>> Just a clarification -- I need to access two different types of local >> >>> resources. Cluster (via Lava or Condor) is one, but for the multicore >> >>> nodes we have on the network, which are not part of cluster, the only >> >>> option is to use ssh. >> >>> >> >>> AF >> >>> >> >>> >> >>> >> >>> On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: >> >>>> >> >>>> Andriy, >> >>>> >> >>>> Ben or Mihael may have better ideas, but I offer my thoughts below.
>> >>>> >> >>>> On 6/12/09 1:18 PM, Andriy Fedorov wrote: >> >>>>> >> >>>>> Hi, >> >>>>> >> >>>>> I am trying to set up Swift with the local cluster and non-cluster >> >>>>> resources in our lab. Here some configuration details. >> >>>>> >> >>>>> Due to technical problems, passphrase login is not possible for the >> >>>>> nodes on local network, and I need to enter password each time. >> >>>>> >> >>>>> For the cluster, I was able to set up passphrase login for the head >> >>>>> node. The cluster is running Lava and Condor schedulers at the same >> >>>>> time, but Lava should be used if possible. >> >>>>> >> >>>>> Two questions: >> >>>>> >> >>>>> (1) is it possible to configure Swift to talk to Lava scheduler? >> >>>> >> >>>> Making Swift talk to a new scheduler means writing a new CoG provider (in >> >>>> Java). You can likely use an existing "data" provider like "local"; you >> >>>> could model the "execution" provider after the "PBS" provider. How hard >> >>>> this >> >>>> is depends on how close Lava is to PBS in nature. (I dont know it). And >> >>>> the >> >>>> provider interface you need to code to is not well documented afaik. >> >>>> >> >>>> I would try the Condor provider. While that provider is less mature and >> >>>> tested than others, it should work, and if it doesnt, we should try to >> >>>> fix >> >>>> it. >> >>>> >> >>>> If possible, make sure a simple condor_submit hello-world works for you >> >>>> first. >> >>>> >> >>>> Run swift on the head/login node; use the "local" data provider. >> >>>> >> >>>> Another route is to use Falkon, but that will be harder and its less >> >>>> supported, so I suggest against this until easier routes are exhausted. >> >>>> >> >>>> I dont think that ssh will get you far, as to leverage the cluster I >> >>>> think >> >>>> you'd need to describe each worker node with a separate sites.xml entry. 
>> >>>> Thats fine in principle, but a bit awkward, and may have scheduling >> >>>> issues >> >>>> (ie if ssh hangs or dies when you dont own the node). >> >>>> >> >>>> Save ssh as another last resort; I suggest trying Condor first. >> >>>> >> >>>> If needed, people who used ssh recently can send you the info below. >> >>>> >> >>>> - Mike >> >>>> >> >>>>> (2) I am following the instructions on setting up ssh site provider to >> >>>>> use nodes on the local network. >> >>>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent >> >>>>> running, and can ssh to the remote node without being asked for >> >>>>> password? >> >>>>> (2.2) can anybody give me more detailed instructions on how to set >> >>>>> up auth.defaults? I cannot make it work. >> >>>> >> >>>>> Thanks >> >>>>> >> >>>>> Andriy Fedorov >> >>>>> _______________________________________________ >> > >> > >> > >> > -- >> > Allan M. Espinosa >> > PhD student, Computer Science >> > University of Chicago >> > >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > From benc at hawaga.org.uk Mon Jun 15 11:50:44 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Mon, 15 Jun 2009 16:50:44 +0000 (GMT) Subject: [Swift-user] multiple files for each app call In-Reply-To: <63cc32bc0906150941n2afe574dud938d577d3c2852c@mail.gmail.com> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C37002@BALI.uhd.campus> <63cc32bc0906150941n2afe574dud938d577d3c2852c@mail.gmail.com> Message-ID: On Mon, 15 Jun 2009, Melinda Chin wrote: > And another question on the side what's the difference > between swift-user at ci.uchicago.edu and swift-user-owner at ci.uchicago.edu. > For help we can send to both? or usually just swift-user?

swift-user-owner is an administrative address that you should not send mail
to - it goes to the people who run the swift-user mailing list.

--

From benc at hawaga.org.uk Mon Jun 15 11:52:55 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 15 Jun 2009 16:52:55 +0000 (GMT)
Subject: [Swift-user] multiple files for each app call
In-Reply-To: <63cc32bc0906150941n2afe574dud938d577d3c2852c@mail.gmail.com>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37000@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C37002@BALI.uhd.campus> <63cc32bc0906150941n2afe574dud938d577d3c2852c@mail.gmail.com>
Message-ID:

That error message is telling you that Swift cannot find a tc.data entry for
the translate program on any of the sites defined in your sites file.

Based on the command line that you showed, maybe you aren't actually using
the tc.data file that you think you are.

Try

swift -tc.file ./tc.data t.swift

to explicitly tell Swift to use the tc.data file in the current directory.

--

From zhaozhang at uchicago.edu Mon Jun 15 11:56:41 2009
From: zhaozhang at uchicago.edu (Zhao Zhang)
Date: Mon, 15 Jun 2009 11:56:41 -0500
Subject: [Swift-user] Questions and ERRORs
In-Reply-To: <63cc32bc0906150943q42790c10s5ea98cf71319ada2@mail.gmail.com>
References: <63cc32bc0906150943q42790c10s5ea98cf71319ada2@mail.gmail.com>
Message-ID: <4A367D49.3020104@uchicago.edu>

Hi, Melinda

Run t.swift with the following command: "swift -tc.file ./tc.data t.swift".
If tc.data is not in the same directory as t.swift, replace ./tc.data with
the path to your tc.data file.

zhao

From benc at hawaga.org.uk Mon Jun 15 11:57:11 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 15 Jun 2009 16:57:11 +0000 (GMT)
Subject: [Swift-user] still having problems
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>
Message-ID:

Based on previous messages, it's still unclear to me that you have the
program working outside of Swift.
Pretty much you need to be able to run

rInvoke perm.short.R 0

from the command line successfully before attempting to run that command
inside Swift.

From zhaozhang at uchicago.edu Mon Jun 15 12:01:38 2009
From: zhaozhang at uchicago.edu (Zhao Zhang)
Date: Mon, 15 Jun 2009 12:01:38 -0500
Subject: [Swift-user] still having problems
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>
Message-ID: <4A367E72.1050604@uchicago.edu>

Hi, Erin

Could you try to run it without Swift? Without Swift, what does the command
look like if you invoke "RInvoke" from the shell? From your Swift script, I am
assuming RInvoke requires two input parameters: an input file and an integer.

What does the output look like? Does RInvoke write to standard output or to an
output file? How do we define the output file name before we invoke the
script?

zhao

From HodgessE at uhd.edu Mon Jun 15 12:33:53 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Mon, 15 Jun 2009 12:33:53 -0500
Subject: [Swift-user] still having problems
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37006@BALI.uhd.campus>

It works fine from the command line. The echo $? gives a zero, and the correct
file is produced.

It just seems to be the jump from command line to Swift that is the snag.

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

-----Original Message-----
From: Ben Clifford [mailto:benc at hawaga.org.uk]
Sent: Mon 6/15/2009 11:57 AM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] still having problems

Based on previous messages, it's still unclear to me that you have the
program working outside of Swift.

Pretty much you need to be able to run

rInvoke perm.short.R 0

from the command line successfully before attempting to run that command
inside Swift.
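
One way to narrow down a "works on the command line, fails under Swift" gap like the one reported in this thread is to rerun the command in a scratch directory that holds only the files the Swift script declares. The failure report earlier in the thread shows the job running in its own directory (`perm4-20090615-1137-nbfujhb2/jobs/v/RInvoke-vti60bcj`), and perm.short.R reads `origccf.txt` and `corr.perm.matrix.txt` by relative name. Whether that is the actual cause here is an assumption, not something the thread confirms. The helper name `run_staged` is hypothetical; this is a sketch, not a Swift feature.

```shell
# run_staged: run a command in an empty scratch directory that contains only
# the files listed in the first argument (space-separated; names without
# spaces). This roughly mimics a job directory holding only declared inputs.
run_staged() {
    staged="$1"; shift
    start_dir=$(pwd)
    scratch=$(mktemp -d) || return 1
    for f in $staged; do
        # Copy each declared input; give up if one is missing.
        cp "$start_dir/$f" "$scratch/" || { rm -rf "$scratch"; return 1; }
    done
    status=0
    ( cd "$scratch" && "$@" ) || status=$?
    rm -rf "$scratch"
    return "$status"
}

# Usage (RInvoke must be on PATH or given as an absolute path):
#   run_staged "perm.short.R" RInvoke perm.short.R 0; echo "exit: $?"
#   run_staged "perm.short.R origccf.txt corr.perm.matrix.txt" \
#       RInvoke perm.short.R 0; echo "exit: $?"
```

If the first call fails and the second succeeds, the script depends on files that are not being passed to the app, which would be a reason to declare them as inputs in the Swift script.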

From HodgessE at uhd.edu Mon Jun 15 13:29:50 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Mon, 15 Jun 2009 13:29:50 -0500
Subject: [Swift-user] still having problems
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37005@BALI.uhd.campus> <4A367E72.1050604@uchicago.edu>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37008@BALI.uhd.campus>

I found a still simpler example and made it go.

Now back to perm.short.R

thanks,
erin

Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

-----Original Message-----
From: Zhao Zhang [mailto:zhaozhang at uchicago.edu]
Sent: Mon 6/15/2009 12:01 PM
To: Hodgess, Erin
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] still having problems

Hi, Erin

Could you try to run it without Swift? Without Swift, what does the command
look like if you invoke "RInvoke" from the shell? From your Swift script, I am
assuming RInvoke requires two input parameters: an input file and an integer.

What does the output look like? Does RInvoke write to standard output or to an
output file? How do we define the output file name before we invoke the
script?

zhao

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From hategan at mcs.anl.gov Mon Jun 15 14:42:25 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 15 Jun 2009 14:42:25 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <82f536810906150949t1fbbaa01q8fccd1017245950b@mail.gmail.com>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov> <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com> <1244891562.10588.6.camel@localhost> <82f536810906150949t1fbbaa01q8fccd1017245950b@mail.gmail.com>
Message-ID: <1245094945.4624.4.camel@localhost>

On Mon, 2009-06-15 at 12:49 -0400, Andriy Fedorov wrote:
> On Sat, Jun 13, 2009 at 7:12 AM, Mihael Hategan wrote:
> > If you want to use password authentication, remove the key and
> > passphrase lines and add .password=xxx.
> >
>
> ...and add
>
> host.site.type=password

Ooops!

>
> What you suggested is actually working for me, thank you for the
> clarification. But having password saved in plain text is not a very
> comfortable option.

No, it isn't. Unless there is no access to the machine from the outside,
in which case it's pretty much equivalent to (if not better than)
host-based authentication.

>
> Please consider this a feature request: it would be great if for the
> SSH authentication, the user was asked for a password once, and then
> the password (or its hash) was stored securely, and authentication was
> done transparently.

Well... there is such a feature in cog, but I don't think it's enabled
in swift. It's a matter of SSH not having been used much. So I think we
can work something out.

>
> I am even willing to enter my password each time I submit a script,
> rather than having it saved as text on a filesystem!

That's likely what will happen. There is no "secure way" I know of for
storing retrievable passwords on a shared disk.
> But I am not > given an option to do it like this at this point. > > > > However, in some cases being asked by ssh for a password may mean > > "keyboard interactive authentication" which is not the same as > > "username/password authentication" and may not work with the ssh > > provider. Do an ssh -v and you'll see the list of authentication methods > > that the server allows. > > > >> > >> AF > >> > >> > >> > >> On Fri, Jun 12, 2009 at 4:04 PM, Allan > >> Espinosa wrote: > >> > Hi Andriy and Mike, > >> > > >> > here is my example ~/.ssh/auth.defaults for executing jobs on > >> > tp-login1.ci.uchicago.edu: > >> > > >> > [aespinosa at tp-login2 ~]$ cat .ssh/auth.defaults > >> > tp-login1.ci.uchicago.edu.type=key > >> > tp-login1.ci.uchicago.edu.username=aespinosa > >> > tp-login1.ci.uchicago.edu.key=/home/aespinosa/.ssh/id_dsa > >> > tp-login1.ci.uchicago.edu.passphrase=XXXXXXXX > >> > > >> > We have used falkon and coasters before for multi-core configurations. > >> > but for a single multi-core machine, i believe you can get away with > >> > having multiple entries of the same host in the sites.xml file using > >> > the ssh-provider. > >> > > >> > ie: > >> > > >> > > >> > > >> > > >> > ... > >> > > >> > > >> > ... > >> > > >> > > >> > > >> > 2009/6/12 Michael Wilde : > >> >> Ah, very cool. Im eager to get more user experience feedback on multicore > >> >> use. > >> >> > >> >> So I will try to hunt down my examples of .ssh configs. > >> >> > >> >> Also, Allan Espinosa used this recently. Allan, can you post details and > >> >> examples? > >> >> > >> >> Thanks! > >> >> > >> >> Mike > >> >> > >> >> On 6/12/09 2:06 PM, Andriy Fedorov wrote: > >> >>> > >> >>> Michael, > >> >>> > >> >>> Thank you for the advice, I will look into this. This is very helpful. > >> >>> I had an impression Lava is not included in the list of schedulers > >> >>> supported out of the box, but wanted to check. 
> >> >>> > >> >>> Just a clarification -- I need to access two different types of local > >> >>> resources. Cluster (via Lava or Condor) is one, but for the multicore > >> >>> nodes we have on the network, which are not part of cluster, the only > >> >>> option is to use ssh. > >> >>> > >> >>> AF > >> >>> > >> >>> > >> >>> > >> >>> On Fri, Jun 12, 2009 at 2:52 PM, Michael Wilde wrote: > >> >>>> > >> >>>> Andriy, > >> >>>> > >> >>>> Ben or Mihael may have better ideas, but I offer my thoughts below. > >> >>>> > >> >>>> On 6/12/09 1:18 PM, Andriy Fedorov wrote: > >> >>>>> > >> >>>>> Hi, > >> >>>>> > >> >>>>> I am trying to set up Swift with the local cluster and non-cluster > >> >>>>> resources in our lab. Here some configuration details. > >> >>>>> > >> >>>>> Due to technical problems, passphrase login is not possible for the > >> >>>>> nodes on local network, and I need to enter password each time. > >> >>>>> > >> >>>>> For the cluster, I was able to set up passphrase login for the head > >> >>>>> node. The cluster is running Lava and Condor schedulers at the same > >> >>>>> time, but Lava should be used if possible. > >> >>>>> > >> >>>>> Two questions: > >> >>>>> > >> >>>>> (1) is it possible to configure Swift to talk to Lava scheduler? > >> >>>> > >> >>>> Making Swift talk to a new scheduler means writing a new CoG provider (in > >> >>>> Java). You can likely use an existing "data" provider like "local"; you > >> >>>> could model the "execution" provider after the "PBS" provider. How hard > >> >>>> this > >> >>>> is depends on how close Lava is to PBS in nature. (I dont know it). And > >> >>>> the > >> >>>> provider interface you need to code to is not well documented afaik. > >> >>>> > >> >>>> I would try the Condor provider. While that provider is less mature and > >> >>>> tested than others, it should work, and if it doesnt, we should try to > >> >>>> fix > >> >>>> it. 
> >> >>>> > >> >>>> If possible, make sure a simple condor_submit hello-world works for you > >> >>>> first. > >> >>>> > >> >>>> Run swift on the head/login node; use the "local" data provider. > >> >>>> > >> >>>> Another route is to use Falkon, but that will be harder and its less > >> >>>> supported, so I suggest against this until easier routes are exhausted. > >> >>>> > >> >>>> I dont think that ssh will get you far, as to leverage the cluster I > >> >>>> think > >> >>>> you'd need to describe each worker node with a separate sites.xml entry. > >> >>>> Thats fine in principle, but a bit awkward, and may have scheduling > >> >>>> issues > >> >>>> (ie if ssh hangs or dies when you dont own the node). > >> >>>> > >> >>>> Save ssh as another last resort; I suggest trying Condor first. > >> >>>> > >> >>>> If needed, people who used ssh recently can send you the info below. > >> >>>> > >> >>>> - Mike > >> >>>> > >> >>>>> (2) I am following the instructions on setting up ssh site provider to > >> >>>>> use nodes on the local network. > >> >>>>> (2.1) do I need to set up auth.defaults even if I have ssh-agent > >> >>>>> running, and can ssh to the remote node without being asked for > >> >>>>> password? > >> >>>>> (2.2.) can anybody give me more detailed instructions on how to set > >> >>>>> up auth.defaults? I cannot make it work. > >> >>>> > >> >>>>> Thanks > >> >>>>> > >> >>>>> Andriy Fedorov > >> >>>>> _______________________________________________ > >> > > >> > > >> > > >> > -- > >> > Allan M. 
Espinosa > >> > PhD student, Computer Science > >> > University of Chicago > >> > > >> _______________________________________________ > >> Swift-user mailing list > >> Swift-user at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > From HodgessE at uhd.edu Tue Jun 16 16:39:01 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 16 Jun 2009 16:39:01 -0500 Subject: [Swift-user] trouble with tutorial Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> Hi all! This swift procedure doesn't seem to like the print statement(?) [erin at communicado swift1]$ swift -tc.file tc.data fold9.swift Could not start execution. Compile error in procedure invocation at line 21: Procedure print is not declared. [erin at communicado swift1]$ cat fold9.swift type counterfile; (counterfile t) echo(string m) { app { echo m stdout=@filename(t); } } (counterfile t) countstep(counterfile i) { app { wcl @filename(i) @filename(t); } } counterfile a[] ; a[0] = echo("793578934574893"); iterate v { a[v+1] = countstep(a[v]); print("extract int value ", at extractint(a[v+1])); } until (@extractint(a[v+1]) <= 1); [erin at communicado swift1]$ cat tc.data #NOTE WELL: fields in this file must be separated by tabs, not spaces # and there must be no trailing whitespace at the end of each line. # # sitename app pathname (ignored) (ignored) profiles localhost echo /bin/echo INSTALLED INTEL32::LINUX null teraport echo /bin/echo INSTALLED INTEL32::LINUX null localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null localhost wcl /home/erin/swift1 INSTALLED INTEL32::LINUX null Erin M. 
Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Tue Jun 16 16:47:16 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 16 Jun 2009 16:47:16 -0500 Subject: [Swift-user] trouble with tutorial In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> Message-ID: <4A3812E4.8060200@mcs.anl.gov> Erin, The print() built-in procedure has been replaced by one called "trace()". Replace print with trace in: print("extract int value ", at extractint(a[v+1])); - Mike On 6/16/09 4:39 PM, Hodgess, Erin wrote: > Hi all! > > This swift procedure doesn't seem to like the print statement(?) > > > [erin at communicado swift1]$ swift -tc.file tc.data fold9.swift > Could not start execution. > Compile error in procedure invocation at line 21: Procedure > print is not declared. > [erin at communicado swift1]$ cat fold9.swift > type counterfile; > > (counterfile t) echo(string m) { > app { > echo m stdout=@filename(t); > } > } > > (counterfile t) countstep(counterfile i) { > app { > wcl @filename(i) @filename(t); > } > } > > counterfile a[] ; > > a[0] = echo("793578934574893"); > > iterate v { > a[v+1] = countstep(a[v]); > print("extract int value ", at extractint(a[v+1])); > } until (@extractint(a[v+1]) <= 1); > [erin at communicado swift1]$ cat tc.data > #NOTE WELL: fields in this file must be separated by tabs, not spaces > # and there must be no trailing whitespace at the end of each > line. 
> # > # sitename app pathname (ignored) (ignored) > profiles > localhost echo /bin/echo INSTALLED INTEL32::LINUX null > teraport echo /bin/echo INSTALLED INTEL32::LINUX null > localhost translate /usr/bin/tr INSTALLED > INTEL32::LINUX null > localhost R /home/erin/R-2.9.0/bin/R INSTALLED > INTEL32::LINUX null > localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null > localhost convert /usr/bin/convert INSTALLED > INTEL32::LINUX null > localhost wcl /home/erin/swift1 INSTALLED > INTEL32::LINUX null > > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From benc at hawaga.org.uk Tue Jun 16 16:47:32 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 16 Jun 2009 21:47:32 +0000 (GMT) Subject: [Swift-user] trouble with tutorial In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> Message-ID: On Tue, 16 Jun 2009, Hodgess, Erin wrote: > This swift procedure doesn't seem to like the print statement(?) Print doesn't exist any more - the closest equivalent in recent versions is the trace() procedure which behaves differently but roughly the same. -- From HodgessE at uhd.edu Tue Jun 16 16:49:25 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 16 Jun 2009 16:49:25 -0500 Subject: [Swift-user] trouble with tutorial References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> <4A3812E4.8060200@mcs.anl.gov> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37038@BALI.uhd.campus> Ok. I've used trace; thought it was only for integers/numeric. Thanks, Erin Erin M. 
Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: Michael Wilde [mailto:wilde at mcs.anl.gov] Sent: Tue 6/16/2009 4:47 PM To: Hodgess, Erin Cc: swift-user at ci.uchicago.edu Subject: Re: [Swift-user] trouble with tutorial Erin, The print() built-in procedure has been replaced by one called "trace()". Replace print with trace in: print("extract int value ", at extractint(a[v+1])); - Mike On 6/16/09 4:39 PM, Hodgess, Erin wrote: > Hi all! > > This swift procedure doesn't seem to like the print statement(?) > > > [erin at communicado swift1]$ swift -tc.file tc.data fold9.swift > Could not start execution. > Compile error in procedure invocation at line 21: Procedure > print is not declared. > [erin at communicado swift1]$ cat fold9.swift > type counterfile; > > (counterfile t) echo(string m) { > app { > echo m stdout=@filename(t); > } > } > > (counterfile t) countstep(counterfile i) { > app { > wcl @filename(i) @filename(t); > } > } > > counterfile a[] ; > > a[0] = echo("793578934574893"); > > iterate v { > a[v+1] = countstep(a[v]); > print("extract int value ", at extractint(a[v+1])); > } until (@extractint(a[v+1]) <= 1); > [erin at communicado swift1]$ cat tc.data > #NOTE WELL: fields in this file must be separated by tabs, not spaces > # and there must be no trailing whitespace at the end of each > line. > # > # sitename app pathname (ignored) (ignored) > profiles > localhost echo /bin/echo INSTALLED INTEL32::LINUX null > teraport echo /bin/echo INSTALLED INTEL32::LINUX null > localhost translate /usr/bin/tr INSTALLED > INTEL32::LINUX null > localhost R /home/erin/R-2.9.0/bin/R INSTALLED > INTEL32::LINUX null > localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null > localhost convert /usr/bin/convert INSTALLED > INTEL32::LINUX null > localhost wcl /home/erin/swift1 INSTALLED > INTEL32::LINUX null > > > > Erin M. 
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From benc at hawaga.org.uk Tue Jun 16 16:58:46 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 16 Jun 2009 21:58:46 +0000 (GMT) Subject: [Swift-user] trouble with tutorial In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37038@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37036@BALI.uhd.campus> <4A3812E4.8060200@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C37038@BALI.uhd.campus> Message-ID: swift svn r2962 removes a couple of remaining uses of print from the tutorial and userguide that I'd omitted previously. -- From HodgessE at uhd.edu Tue Jun 16 17:13:59 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 16 Jun 2009 17:13:59 -0500 Subject: [Swift-user] still stuck on fold9.swift Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C3703B@BALI.uhd.campus> Ok. We're back to fold9.swift again, but it's saying that there are multiple writers. These are in /home/erin/swift1. Is it because of the recursive nature of the a[v+1] setup, maybe? thanks, Erin [erin at communicado swift1]$ swift -tc.file tc.data fold9.swift Could not start execution. variable a has multiple writers. 
[erin at communicado swift1]$ cat fold9.swift type counterfile; (counterfile t) echo(string m) { app { echo m stdout=@filename(t); } } (counterfile t) countstep(counterfile i) { app { wcl @filename(i) @filename(t); } } counterfile a[] ; a[0] = echo("793578934574893"); iterate v { a[v+1] = countstep(a[v]); trace(@extractint(a[v+1])); } until (@extractint(a[v+1]) <= 1); [erin at communicado swift1]$ cat tc.data #NOTE WELL: fields in this file must be separated by tabs, not spaces # and there must be no trailing whitespace at the end of each line. # # sitename app pathname (ignored) (ignored) profiles localhost echo /bin/echo INSTALLED INTEL32::LINUX null teraport echo /bin/echo INSTALLED INTEL32::LINUX null localhost translate /usr/bin/tr INSTALLED INTEL32::LINUX null localhost R /home/erin/R-2.9.0/bin/R INSTALLED INTEL32::LINUX null localhost wc /usr/bin/wc INSTALLED INTEL32::LINUX null localhost convert /usr/bin/convert INSTALLED INTEL32::LINUX null localhost wcl /home/erin/swift1 INSTALLED INTEL32::LINUX null [erin at communicado swift1]$ Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From benc at hawaga.org.uk Tue Jun 16 17:26:25 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 16 Jun 2009 22:26:25 +0000 (GMT) Subject: [Swift-user] still stuck on fold9.swift In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C3703B@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C3703B@BALI.uhd.campus> Message-ID: On Tue, 16 Jun 2009, Hodgess, Erin wrote: > Ok. We're back to fold9.swift again, but it's saying that there are > multiple writers. > These are in /home/erin/swift1. > > Is it because of the recursive nature of the a[v+1] setup, maybe? 
This is a static compile-time analysis problem - Swift looks at the source
code and sees that you are assigning to the a[] array in one place (the
a[0] statement, outside of iterate) and again in another place (the a[v+1]
statement inside of iterate).

It's bothered me in the past that this hasn't worked, but I hadn't realised
that it did at one stage actually work (which it must have done to be
written in the tutorial).

It's probably useful to file a bug about this, then - it's a comment I have
from doing things with the 3rd provenance challenge over the past couple of
months too.

I think in the long term it's a use that should be accepted, but the syntax
of Swift makes this kind of analysis incredibly awkward to get right.

You can maybe work around this with something like this (untested):

replace

  a[v+1] = countstep(a[v]);

with

  if(v==0) {
    a[v+1] = countstep(startfile);
  } else {
    a[v+1] = countstep(a[v]);
  }

and replace the a[0] assignment line with:

  counterfile startfile = echo("793578934574893");

--

From HodgessE at uhd.edu Wed Jun 17 11:18:18 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Wed, 17 Jun 2009 11:18:18 -0500
Subject: [Swift-user] Mapper.Factory.java
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37042@BALI.uhd.campus>

Hi!

Has anyone used Mapper.Factory.java as mentioned in the tutorial, please?

I'm having some trouble locating it.

thanks,
Erin


Erin M. Hodgess, PhD
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: hodgesse at uhd.edu

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From wilde at mcs.anl.gov Wed Jun 17 11:51:08 2009
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 17 Jun 2009 11:51:08 -0500
Subject: [Swift-user] Mapper.Factory.java
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37042@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37042@BALI.uhd.campus>
Message-ID: <4A391EFC.7050107@mcs.anl.gov>

Erin, the Mapper classes should be in the source tree in an svn checkout.

But unless you have a very exotic mapper need, I suggest using the
external "ext" mapper described in the user guide.

This is far simpler, and can handle most situations I've encountered.

Can you try the ext mapper first, and report back if you don't think it's
capable of meeting your needs? It's a far better way to learn how mappers
work.

- Mike


On 6/17/09 11:18 AM, Hodgess, Erin wrote:
> Hi!
>
> Has anyone used Mapper.Factory.java as mentioned in the tutorial, please?
>
> I'm having some trouble locating it.
>
> thanks,
> Erin
>
>
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

From benc at hawaga.org.uk Thu Jun 18 05:11:26 2009
From: benc at hawaga.org.uk (Ben Clifford)
Date: Thu, 18 Jun 2009 10:11:26 +0000 (GMT)
Subject: [Swift-user] Mapper.Factory.java
In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37042@BALI.uhd.campus>
References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37042@BALI.uhd.campus>
Message-ID: 

On Wed, 17 Jun 2009, Hodgess, Erin wrote:

> Has anyone used Mapper.Factory.java as mentioned in the tutorial, please?

I've not known anyone write a mapper using the Java interface for many
years. That tutorial section should probably go away.
As Mike mentions, the ext mapper seems easier most of the time.

--

From hategan at mcs.anl.gov Thu Jun 18 14:36:03 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 18 Jun 2009 14:36:03 -0500
Subject: [Swift-user] Swift on local resources
In-Reply-To: <1245094945.4624.4.camel@localhost>
References: <82f536810906121118t532b3dc8oe295449a95ae64d6@mail.gmail.com> <4A32A3FD.8000603@mcs.anl.gov> <82f536810906121206v626a1736l8ec04ec60d39848b@mail.gmail.com> <4A32A9FE.4000600@mcs.anl.gov> <50b07b4b0906121304t395195e5g5eca46aba5101a3f@mail.gmail.com> <82f536810906121317i764427abubb23531e6c71d88a@mail.gmail.com> <1244891562.10588.6.camel@localhost> <82f536810906150949t1fbbaa01q8fccd1017245950b@mail.gmail.com> <1245094945.4624.4.camel@localhost>
Message-ID: <1245353763.3915.6.camel@localhost>

On Mon, 2009-06-15 at 14:42 -0500, Mihael Hategan wrote:
> >
> > Please consider this a feature request: it would be great if for the
> > SSH authentication, the user was asked for a password once, and then
> > the password (or its hash) was stored securely, and authentication was
> > done transparently.
>
> Well... there is such a feature in cog, but I don't think it's enabled
> in swift. It's a matter of SSH not having been used much. So I think we
> can work something out.
>

cog r2407 enables this. Use .type=interactive

You'll get a prompt for the username and password or private
key/passphrase. If a graphical terminal is available, you'll see a Swing
dialog window. If not, don't be scared by the funny prompt, just type the
password as you normally would.

For one run and one site, you'll only have to type the information once.
I suspect you may have many ssh hosts, in which case you'll get as many
prompts. I may have a plan for that. In the meantime, you may be able to
use it as it is. Maybe I should use "may" once more.
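
For reference, a sketch of what an auth.defaults entry for the interactive
mode described above might look like. The host name is borrowed from Allan's
earlier example in this thread and is only illustrative. The host-prefixed
form of the key is an assumption based on the .type=key entries in that
example (the archive appears to have dropped the prefix from the message
above), and so is the use of #-style comments in this file.

```
# ~/.ssh/auth.defaults (illustrative sketch, not a tested configuration)
# Assumption: the key uses the same host-prefixed form as the .type=key example.
tp-login1.ci.uchicago.edu.type=interactive
```

With an entry like this, the username and the password or key/passphrase
would presumably not need to be stored in the file, since they are prompted
for at run time as described above.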
From amoore2 at uchicago.edu Mon Jun 22 12:48:44 2009 From: amoore2 at uchicago.edu (Alex Moore) Date: Mon, 22 Jun 2009 12:48:44 -0500 Subject: [Swift-user] Running on Teraport Message-ID: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. Use the following entries for sites.xml and tc.dat: ------ /var/tmp 0 ---------- # sitename transformation path INSTALLED platform profiles pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh INSTALLED INTEL32::LINUX null ---------- My swift program call an app named wormanalysis that is on my ci account. Runs fine locally. Mike Wilde said that changing the " References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> Message-ID: <4A3FC552.2090406@uchicago.edu> Hi, Alex # sitename transformation path INSTALLED platform profiles pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh INSTALLED INTEL32::LINUX null change the "pbs" above to localhost. That "sitename" field should be the same as the in sites.xml best zhao Alex Moore wrote: > Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. > Use the following entries for sites.xml and tc.dat: > ------ > > > > /var/tmp > 0 > > ---------- > # sitename transformation path INSTALLED platform profiles > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > INSTALLED INTEL32::LINUX null > ---------- > > My swift program call an app named wormanalysis that is on my ci > account. Runs fine locally. Mike Wilde said that changing the > " identity=urn:cog-1245692325277)" with constraints {filenames= > [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, > filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, > tr=wormanalysis} > > Any help would be appreciated. Thanks. 
> -Alex
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>
>

From amoore2 at uchicago.edu Mon Jun 22 13:10:48 2009
From: amoore2 at uchicago.edu (Alex Moore)
Date: Mon, 22 Jun 2009 13:10:48 -0500
Subject: [Swift-user] Running on Teraport
In-Reply-To: <4A3FC552.2090406@uchicago.edu>
References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu>
Message-ID: <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com>

Thanks, that got it running. I get a different error message now though:

Execution failed:
Exception in wormanalysis
Arguments: [home/amoore2/Work/Data/070326.tif,
home/amoore2/Work/Save/Ang-Def.0006.dat,
home/amoore2/Work/Save/Props.0006.dat]
Host: localhost
Directory: wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj
stderr.txt

stdout.txt

----
Caused by:
Cannot submit job: java.io.IOException: qsub: not found

The program loads an image file from my CI account as input and
outputs two .dat files to a directory on my CI account as well- I
don't know if it might be something in my code that is causing this.
Thanks.
-Alex

On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang wrote:
> Hi, Alex
>
> # sitename   transformation   path   INSTALLED    platform    profiles
> pbs     wormanalysis    /home/amoore2/work/run_wormanalysis.sh
> INSTALLED    INTEL32::LINUX null
>
> change the "pbs" above to localhost.
> That "sitename" field should be the same as the in sites.xml
>
> best
> zhao
>
>
> Alex Moore wrote:
>>
>> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu.
>> Use the following entries for sites.xml and tc.dat:
>> ------
>>
>>
>>
>>    /var/tmp
>>    0
>>
>> ----------
>> # sitename   transformation   path   INSTALLED    platform    profiles
>> pbs     wormanalysis    /home/amoore2/work/run_wormanalysis.sh
>> INSTALLED
>> INTEL32::LINUX null
>> ----------
>>
>> My swift program call an app named wormanalysis that is on my ci
>> account. Runs fine locally. Mike Wilde said that changing the
>> "> identity=urn:cog-1245692325277)" with constraints {filenames=
>> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis,
>> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9,
>> tr=wormanalysis}
>>
>> Any help would be appreciated. Thanks.
>> -Alex
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>
>>
>

From zhaozhang at uchicago.edu Mon Jun 22 13:15:22 2009
From: zhaozhang at uchicago.edu (Zhao Zhang)
Date: Mon, 22 Jun 2009 13:15:22 -0500
Subject: [Swift-user] Running on Teraport
In-Reply-To: <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com>
References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com>
Message-ID: <4A3FCA3A.5010200@uchicago.edu>

Hi, Alex

On which machine are you running swift?

zhao

Alex Moore wrote:
> Thanks, that got it running. I get a different error message now though:
>
> Execution failed:
> Exception in wormanalysis
> Arguments: [home/amoore2/Work/Data/070326.tif,
> home/amoore2/Work/Save/Ang-Def.0006.dat,
> home/amoore2/Work/Save/Props.0006.dat]
> Host: localhost
> Directory: wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj
> stderr.txt
>
> stdout.txt
>
> ----
> Caused by:
> Cannot submit job: java.io.IOException: qsub: not found
>
> The program loads an image file from my CI account as input and
> outputs two .dat files to a directory on my CI account as well- I
> don't know if it might be something in my code that is causing this.
> Thanks.
> -Alex > > On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang wrote: > >> Hi, Alex >> >> # sitename transformation path INSTALLED platform profiles >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >> INSTALLED INTEL32::LINUX null >> >> change the "pbs" above to localhost. >> That "sitename" field should be the same as the in sites.xml >> >> best >> zhao >> >> >> Alex Moore wrote: >> >>> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. >>> Use the following entries for sites.xml and tc.dat: >>> ------ >>> >>> >>> >>> /var/tmp >>> 0 >>> >>> ---------- >>> # sitename transformation path INSTALLED platform profiles >>> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >>> INSTALLED INTEL32::LINUX null >>> ---------- >>> >>> My swift program call an app named wormanalysis that is on my ci >>> account. Runs fine locally. Mike Wilde said that changing the >>> ">> identity=urn:cog-1245692325277)" with constraints {filenames= >>> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, >>> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, >>> tr=wormanalysis} >>> >>> Any help would be appreciated. Thanks. >>> -Alex >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >>> >>> >>> > > From hockyg at uchicago.edu Mon Jun 22 13:26:43 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Mon, 22 Jun 2009 13:26:43 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> Message-ID: fast 01:00:00 /home/hockyg/swiftwork change swiftwork dir On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore wrote: > Thanks, that got it running. 
I get a different error message now though: > > Execution failed: > Exception in wormanalysis > Arguments: [home/amoore2/Work/Data/070326.tif, > home/amoore2/Work/Save/Ang-Def.0006.dat, > home/amoore2/Work/Save/Props.0006.dat] > Host: localhost > Directory: wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj > stderr.txt > > stdout.txt > > ---- > Caused by: > Cannot submit job: java.io.IOException: qsub: not found > > The program loads an image file from my CI account as input and > outputs two .dat files to a directory on my CI account as well- I > don't know if it might be something in my code that is causing this. > Thanks. > -Alex > > On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang > wrote: > > Hi, Alex > > > > # sitename transformation path INSTALLED platform profiles > > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > > INSTALLED INTEL32::LINUX null > > > > change the "pbs" above to localhost. > > That "sitename" field should be the same as the in > sites.xml > > > > best > > zhao > > > > > > Alex Moore wrote: > >> > >> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. > >> Use the following entries for sites.xml and tc.dat: > >> ------ > >> > >> > >> > >> /var/tmp > >> 0 > >> > >> ---------- > >> # sitename transformation path INSTALLED platform profiles > >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > >> INSTALLED INTEL32::LINUX null > >> ---------- > >> > >> My swift program call an app named wormanalysis that is on my ci > >> account. Runs fine locally. Mike Wilde said that changing the > >> " >> identity=urn:cog-1245692325277)" with constraints {filenames= > >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, > >> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, > >> tr=wormanalysis} > >> > >> Any help would be appreciated. Thanks. 
> >> -Alex > >> _______________________________________________ > >> Swift-user mailing list > >> Swift-user at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > >> > >> > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amoore2 at uchicago.edu Mon Jun 22 14:14:48 2009 From: amoore2 at uchicago.edu (Alex Moore) Date: Mon, 22 Jun 2009 14:14:48 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> Message-ID: <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> I'm running swift on tp-login.ci.uchicago.edu, and I want to run it on the cluster. I changed my swiftwork dir as Glen suggested so I now have: pool handle="localhost" > /home/amoore2/Work 0 localhost wormanalysis /home/amoore2/work/run_wormanalysis.sh INSTALLED INTEL32::LINUX null as my sites.xml and tc.dat files. I get the same error message: Caused by: Cannot submit job: java.io.IOException: qsub: not found It also says that it "Failed to transfer wrapper log from wormanalysis-......../info/9 on localhost" when I run the program -Alex On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky wrote: > > ? fast > ? 01:00:00 > ? > ? > ? /home/hockyg/swiftwork > > > change swiftwork dir > > On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore wrote: >> >> Thanks, that got it running. 
I get a different error message now though: >> >> Execution failed: >> Exception in wormanalysis >> Arguments: [home/amoore2/Work/Data/070326.tif, >> home/amoore2/Work/Save/Ang-Def.0006.dat, >> home/amoore2/Work/Save/Props.0006.dat] >> Host: localhost >> Directory: >> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj >> stderr.txt >> >> stdout.txt >> >> ---- >> Caused by: >> Cannot submit job: java.io.IOException: qsub: not found >> >> The program loads an image file from my CI account as input and >> outputs two .dat files to a directory on my CI account as well- I >> don't know if it might be something in my code that is causing this. >> Thanks. >> -Alex >> >> On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang >> wrote: >> > Hi, Alex >> > >> > # sitename transformation path INSTALLED platform profiles >> > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >> > INSTALLED INTEL32::LINUX null >> > >> > change the "pbs" above to localhost. >> > That "sitename" field should be the same as the in >> > sites.xml >> > >> > best >> > zhao >> > >> > >> > Alex Moore wrote: >> >> >> >> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. >> >> Use the following entries for sites.xml and tc.dat: >> >> ------ >> >> >> >> >> >> >> >> /var/tmp >> >> 0 >> >> >> >> ---------- >> >> # sitename transformation path INSTALLED platform profiles >> >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >> >> INSTALLED INTEL32::LINUX null >> >> ---------- >> >> >> >> My swift program call an app named wormanalysis that is on my ci >> >> account. Runs fine locally. Mike Wilde said that changing the >> >> "> >> identity=urn:cog-1245692325277)" with constraints {filenames= >> >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, >> >> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, >> >> tr=wormanalysis} >> >> >> >> Any help would be appreciated. Thanks.
>> >> -Alex >> >> _______________________________________________ >> >> Swift-user mailing list >> >> Swift-user at ci.uchicago.edu >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >> >> >> >> >> > >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > From zhaozhang at uchicago.edu Mon Jun 22 14:22:11 2009 From: zhaozhang at uchicago.edu (Zhao Zhang) Date: Mon, 22 Jun 2009 14:22:11 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> Message-ID: <4A3FD9E3.9010404@uchicago.edu> Hi, Try to put the following two lines after the fast 01:00:00 zhao Alex Moore wrote: > I'm running swift on tp-login.ci.uchicago.edu, and I want to run it on > the cluster. I changed my swiftwork dir as Glen suggested so I now > have: > > pool handle="localhost" > > > > /home/amoore2/Work > 0 > > > localhost wormanalysis /home/amoore2/work/run_wormanalysis.sh > INSTALLED INTEL32::LINUX null > > as my sites.xml and tc.dat files. I get the same error message: > > Caused by: > Cannot submit job: java.io.IOException: qsub: not found > > It also says that it "Failed to transfer wrapper log from > wormanalysis-......../info/9 on localhost" when I run the program > > -Alex > > On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky wrote: > >> >> fast >> 01:00:00 >> >> >> /home/hockyg/swiftwork >> >> >> change swiftwork dir >> >> On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore wrote: >> >>> Thanks, that got it running. 
I get a different error message now though: >>> >>> Execution failed: >>> Exception in wormanalysis >>> Arguments: [home/amoore2/Work/Data/070326.tif, >>> home/amoore2/Work/Save/Ang-Def.0006.dat, >>> home/amoore2/Work/Save/Props.0006.dat] >>> Host: localhost >>> Directory: >>> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj >>> stderr.txt >>> >>> stdout.txt >>> >>> ---- >>> Caused by: >>> Cannot submit job: java.io.IOException: qsub: not found >>> >>> The program loads an image file from my CI account as input and >>> outputs two .dat files to a directory on my CI account as well- I >>> don't know if it might be something in my code that is causing this. >>> Thanks. >>> -Alex >>> >>> On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang >>> wrote: >>> >>>> Hi, Alex >>>> >>>> # sitename transformation path INSTALLED platform profiles >>>> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >>>> INSTALLED INTEL32::LINUX null >>>> >>>> change the "pbs" above to localhost. >>>> That "sitename" field should be the same as the in >>>> sites.xml >>>> >>>> best >>>> zhao >>>> >>>> >>>> Alex Moore wrote: >>>> >>>>> Trying to run a job on Teraport. Logged into tp-login.ci.uchicago.edu. >>>>> Use the following entries for sites.xml and tc.dat: >>>>> ------ >>>>> >>>>> >>>>> >>>>> /var/tmp >>>>> 0 >>>>> >>>>> ---------- >>>>> # sitename transformation path INSTALLED platform profiles >>>>> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >>>>> INSTALLED INTEL32::LINUX null >>>>> ---------- >>>>> >>>>> My swift program call an app named wormanalysis that is on my ci >>>>> account. Runs fine locally. Mike Wilde said that changing the >>>>> ">>>> identity=urn:cog-1245692325277)" with constraints {filenames= >>>>> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, >>>>> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, >>>>> tr=wormanalysis} >>>>> >>>>> Any help would be appreciated. Thanks. 
>>>>> -Alex >>>>> _______________________________________________ >>>>> Swift-user mailing list >>>>> Swift-user at ci.uchicago.edu >>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >>>>> >>>>> >>>>> >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >>> >> > > From hockyg at uchicago.edu Mon Jun 22 14:31:10 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Mon, 22 Jun 2009 14:31:10 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> Message-ID: You have to change your environment until typing "qstat" on the command line works. I'm guessing if you edit $HOME/.soft and put in the line +torque and then type resoft;source ~/.bashrc you'll be good to go On Mon, Jun 22, 2009 at 2:14 PM, Alex Moore wrote: > I'm running swift on tp-login.ci.uchicago.edu, and I want to run it on > the cluster. I changed my swiftwork dir as Glen suggested so I now > have: > > pool handle="localhost" > > > > /home/amoore2/Work > 0 > > > localhost wormanalysis /home/amoore2/work/run_wormanalysis.sh > INSTALLED INTEL32::LINUX null > > as my sites.xml and tc.dat files. I get the same error message: > > Caused by: > Cannot submit job: java.io.IOException: qsub: not found > > It also says that it "Failed to transfer wrapper log from > wormanalysis-......../info/9 on localhost" when I run the program > > -Alex > > On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky wrote: > > > > fast > > 01:00:00 > > > > > > /home/hockyg/swiftwork > > > > > > change swiftwork dir > > > > On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore > wrote: > >> > >> Thanks, that got it running. 
I get a different error message now though: > >> > >> Execution failed: > >> Exception in wormanalysis > >> Arguments: [home/amoore2/Work/Data/070326.tif, > >> home/amoore2/Work/Save/Ang-Def.0006.dat, > >> home/amoore2/Work/Save/Props.0006.dat] > >> Host: localhost > >> Directory: > >> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj > >> stderr.txt > >> > >> stdout.txt > >> > >> ---- > >> Caused by: > >> Cannot submit job: java.io.IOException: qsub: not found > >> > >> The program loads an image file from my CI account as input and > >> outputs two .dat files to a directory on my CI account as well- I > >> don't know if it might be something in my code that is causing this. > >> Thanks. > >> -Alex > >> > >> On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang > >> wrote: > >> > Hi, Alex > >> > > >> > # sitename transformation path INSTALLED platform profiles > >> > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > >> > INSTALLED INTEL32::LINUX null > >> > > >> > change the "pbs" above to localhost. > >> > That "sitename" field should be the same as the in > >> > sites.xml > >> > > >> > best > >> > zhao > >> > > >> > > >> > Alex Moore wrote: > >> >> > >> >> Trying to run a job on Teraport. Logged into > tp-login.ci.uchicago.edu. > >> >> Use the following entries for sites.xml and tc.dat: > >> >> ------ > >> >> > >> >> > >> >> > >> >> /var/tmp > >> >> 0 > >> >> > >> >> ---------- > >> >> # sitename transformation path INSTALLED platform > profiles > >> >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > >> >> INSTALLED INTEL32::LINUX null > >> >> ---------- > >> >> > >> >> My swift program call an app named wormanalysis that is on my ci > >> >> account. Runs fine locally. 
Mike Wilde said that changing the > >> >> " >> >> identity=urn:cog-1245692325277)" with constraints {filenames= > >> >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, > >> >> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, > >> >> tr=wormanalysis} > >> >> > >> >> Any help would be appreciated. Thanks. > >> >> -Alex > >> >> _______________________________________________ > >> >> Swift-user mailing list > >> >> Swift-user at ci.uchicago.edu > >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > >> >> > >> >> > >> > > >> _______________________________________________ > >> Swift-user mailing list > >> Swift-user at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaozhang at uchicago.edu Mon Jun 22 14:32:20 2009 From: zhaozhang at uchicago.edu (Zhao Zhang) Date: Mon, 22 Jun 2009 14:32:20 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> Message-ID: <4A3FDC44.8080305@uchicago.edu> Ah, yes add the following two lines in ~/.soft +maui +torque then do what Glen said will fix the problem. zhao Glen Hocky wrote: > You have to change your environment until typing "qstat" on the > command line works. > I'm guessing if you edit $HOME/.soft and put in the line > > +torque > > and then type > > resoft;source ~/.bashrc > > you'll be good to go > > On Mon, Jun 22, 2009 at 2:14 PM, Alex Moore > wrote: > > I'm running swift on tp-login.ci.uchicago.edu > , and I want to run it on > the cluster. 
I changed my swiftwork dir as Glen suggested so I now > have: > > pool handle="localhost" > > > > /home/amoore2/Work > 0 > > > localhost wormanalysis /home/amoore2/work/run_wormanalysis.sh > INSTALLED INTEL32::LINUX null > > as my sites.xml and tc.dat files. I get the same error message: > > Caused by: > Cannot submit job: java.io.IOException: qsub: not found > > It also says that it "Failed to transfer wrapper log from > wormanalysis-......../info/9 on localhost" when I run the program > > -Alex > > On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky > wrote: > > > > fast > > 01:00:00 > > > > > > /home/hockyg/swiftwork > > > > > > change swiftwork dir > > > > On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore > > wrote: > >> > >> Thanks, that got it running. I get a different error message > now though: > >> > >> Execution failed: > >> Exception in wormanalysis > >> Arguments: [home/amoore2/Work/Data/070326.tif, > >> home/amoore2/Work/Save/Ang-Def.0006.dat, > >> home/amoore2/Work/Save/Props.0006.dat] > >> Host: localhost > >> Directory: > >> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj > >> stderr.txt > >> > >> stdout.txt > >> > >> ---- > >> Caused by: > >> Cannot submit job: java.io.IOException: qsub: not found > >> > >> The program loads an image file from my CI account as input and > >> outputs two .dat files to a directory on my CI account as well- I > >> don't know if it might be something in my code that is causing > this. > >> Thanks. > >> -Alex > >> > >> On Mon, Jun 22, 2009 at 12:54 PM, Zhao > Zhang> > >> wrote: > >> > Hi, Alex > >> > > >> > # sitename transformation path INSTALLED platform > profiles > >> > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > >> > INSTALLED INTEL32::LINUX null > >> > > >> > change the "pbs" above to localhost. > >> > That "sitename" field should be the same as the in > >> > sites.xml > >> > > >> > best > >> > zhao > >> > > >> > > >> > Alex Moore wrote: > >> >> > >> >> Trying to run a job on Teraport. 
Logged into > tp-login.ci.uchicago.edu. > >> >> Use the following entries for sites.xml and tc.dat: > >> >> ------ > >> >> > >> >> > >> >> > >> >> /var/tmp > >> >> 0 > >> >> > >> >> ---------- > >> >> # sitename transformation path INSTALLED platform > profiles > >> >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh > >> >> INSTALLED INTEL32::LINUX null > >> >> ---------- > >> >> > >> >> My swift program call an app named wormanalysis that is on my ci > >> >> account. Runs fine locally. Mike Wilde said that changing the > >> >> " >> >> identity=urn:cog-1245692325277)" with constraints {filenames= > >> >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, > >> >> > filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, > >> >> tr=wormanalysis} > >> >> > >> >> Any help would be appreciated. Thanks. > >> >> -Alex > >> >> _______________________________________________ > >> >> Swift-user mailing list > >> >> Swift-user at ci.uchicago.edu > >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > >> >> > >> >> > >> > > >> _______________________________________________ > >> Swift-user mailing list > >> Swift-user at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > From hockyg at uchicago.edu Mon Jun 22 14:32:34 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Mon, 22 Jun 2009 14:32:34 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> Message-ID: Also, I think you want the output of "which qsub" to be the following: [hockyg at tp-login2 ~]$ which qsub /soft/torque-2.3.6-r1/bin/qsub On Mon, Jun 22, 2009 at 2:31 PM, Glen Hocky wrote: > You have to change your environment until typing "qstat" on the command > line works.
> I'm guessing if you edit $HOME/.soft and put in the line > > +torque > > and then type > > resoft;source ~/.bashrc > > you'll be good to go > > > On Mon, Jun 22, 2009 at 2:14 PM, Alex Moore wrote: > >> I'm running swift on tp-login.ci.uchicago.edu, and I want to run it on >> the cluster. I changed my swiftwork dir as Glen suggested so I now >> have: >> >> pool handle="localhost" > >> >> >> /home/amoore2/Work >> 0 >> >> >> localhost wormanalysis /home/amoore2/work/run_wormanalysis.sh >> INSTALLED INTEL32::LINUX null >> >> as my sites.xml and tc.dat files. I get the same error message: >> >> Caused by: >> Cannot submit job: java.io.IOException: qsub: not found >> >> It also says that it "Failed to transfer wrapper log from >> wormanalysis-......../info/9 on localhost" when I run the program >> >> -Alex >> >> On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky wrote: >> > >> > fast >> > 01:00:00 >> > >> > >> > /home/hockyg/swiftwork >> > >> > >> > change swiftwork dir >> > >> > On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore >> wrote: >> >> >> >> Thanks, that got it running. I get a different error message now >> though: >> >> >> >> Execution failed: >> >> Exception in wormanalysis >> >> Arguments: [home/amoore2/Work/Data/070326.tif, >> >> home/amoore2/Work/Save/Ang-Def.0006.dat, >> >> home/amoore2/Work/Save/Props.0006.dat] >> >> Host: localhost >> >> Directory: >> >> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj >> >> stderr.txt >> >> >> >> stdout.txt >> >> >> >> ---- >> >> Caused by: >> >> Cannot submit job: java.io.IOException: qsub: not found >> >> >> >> The program loads an image file from my CI account as input and >> >> outputs two .dat files to a directory on my CI account as well- I >> >> don't know if it might be something in my code that is causing this. >> >> Thanks. 
>> >> -Alex >> >> >> >> On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang >> >> wrote: >> >> > Hi, Alex >> >> > >> >> > # sitename transformation path INSTALLED platform >> profiles >> >> > pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >> >> > INSTALLED INTEL32::LINUX null >> >> > >> >> > change the "pbs" above to localhost. >> >> > That "sitename" field should be the same as the in >> >> > sites.xml >> >> > >> >> > best >> >> > zhao >> >> > >> >> > >> >> > Alex Moore wrote: >> >> >> >> >> >> Trying to run a job on Teraport. Logged into >> tp-login.ci.uchicago.edu. >> >> >> Use the following entries for sites.xml and tc.dat: >> >> >> ------ >> >> >> >> >> >> >> >> >> >> >> >> /var/tmp >> >> >> 0 >> >> >> >> >> >> ---------- >> >> >> # sitename transformation path INSTALLED platform >> profiles >> >> >> pbs wormanalysis /home/amoore2/work/run_wormanalysis.sh >> >> >> INSTALLED INTEL32::LINUX null >> >> >> ---------- >> >> >> >> >> >> My swift program call an app named wormanalysis that is on my ci >> >> >> account. Runs fine locally. Mike Wilde said that changing the >> >> >> "> >> >> identity=urn:cog-1245692325277)" with constraints {filenames= >> >> >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, >> >> >> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9 >> , >> >> >> tr=wormanalysis} >> >> >> >> >> >> Any help would be appreciated. Thanks. >> >> >> -Alex >> >> >> _______________________________________________ >> >> >> Swift-user mailing list >> >> >> Swift-user at ci.uchicago.edu >> >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >> >> >> >> >> >> >> >> > >> >> _______________________________________________ >> >> Swift-user mailing list >> >> Swift-user at ci.uchicago.edu >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
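
The environment checks suggested in this thread (make sure qsub and qstat are found, and that ~/.soft carries the +maui and +torque keys) can be sketched as a small shell helper. This is only a minimal sketch, assuming a SoftEnv-style ~/.soft file with one key per line as Zhao and Glen describe; the function names check_cmd and missing_soft_keys are hypothetical and are not part of Swift, SoftEnv, or Torque.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Swift or SoftEnv) for the checks
# discussed in this thread.

# Report whether a command such as qsub or qstat is on the PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
        return 0
    fi
    echo "missing: $1"
    return 1
}

# Print each requested key that does not appear as a whole line in the
# given .soft file. A missing file is treated as having no keys.
missing_soft_keys() {
    soft_file=$1
    shift
    for key in "$@"; do
        grep -qxF -- "$key" "$soft_file" 2>/dev/null || echo "$key"
    done
}

# Example use on the login node (commented out so that sourcing this
# file has no side effects):
#   check_cmd qsub
#   check_cmd qstat
#   missing_soft_keys "$HOME/.soft" +maui +torque
```

If missing_soft_keys prints anything, those keys would be added to ~/.soft by hand, followed by "resoft; source ~/.bashrc" as Glen suggests; afterwards check_cmd qsub should report the command as found.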
From iraicu at cs.uchicago.edu Mon Jun 22 14:42:16 2009
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Mon, 22 Jun 2009 14:42:16 -0500
Subject: [Swift-user] CFP: 2nd ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS09) at Supercomputing 2009
Message-ID: <4A3FDE98.6030102@cs.uchicago.edu>

Call for Papers
---------------------------------------------------------------------------------------
The 2nd ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) 2009
http://dsl.cs.uchicago.edu/MTAGS09/
---------------------------------------------------------------------------------------
November 16th, 2009
Portland, Oregon, USA
Co-located with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC09)
=======================================================================================
The 2nd workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of loosely coupled large scale applications on large scale clusters, Grids, Supercomputers, and Cloud Computing infrastructure. Many-task computing (MTC), the theme of the workshop, encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal. This workshop will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of the raw hardware, parallel file system contention and scalability, reliability at scale, and application scalability. We welcome paper submissions on all topics related to MTC on large scale systems. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library.
The workshop will be co-located with the IEEE/ACM Supercomputing 2009 Conference in Portland, Oregon on November 16th, 2009. For more information, please visit http://dsl.cs.uchicago.edu/MTAGS09/.

Scope
---------------------------------------------------------------------------------------
This workshop will focus on the ability to manage and execute large scale applications on today's largest clusters, Grids, and Supercomputers. Clusters with 50K+ processor cores (e.g. TACC Sun Constellation System - Ranger), Grids (e.g. TeraGrid) with a dozen sites and 100K+ processors, and supercomputers with 160K processors (e.g. IBM BlueGene/P) are beginning to come online. Large clusters and supercomputers have traditionally been high performance computing (HPC) systems, as they are efficient at executing tightly coupled parallel jobs within a particular machine with low-latency interconnects; the applications typically use message passing interface (MPI) to achieve the needed inter-process communication. On the other hand, Grids have been the preferred platform for more loosely coupled applications that tend to be managed and executed through workflow systems. In contrast to HPC (tightly coupled applications), these loosely coupled applications make up a new class of applications that we call Many-Task Computing (MTC). MTC systems generally involve the execution of independent, sequential jobs that can be individually scheduled on many different computing resources across multiple administrative boundaries. MTC systems typically achieve this using various grid computing technologies and techniques, and often use files for inter-process communication as an alternative to MPI. MTC is reminiscent of High Throughput Computing (HTC); however, MTC differs from HTC in its emphasis on using many computing resources over short periods of time to accomplish many computational tasks, where the primary metrics are measured in seconds (e.g. FLOPS, tasks/sec, MB/s I/O rates). HTC, on the other hand, requires large amounts of computing for longer times (months and years, rather than hours and days), and is generally measured in operations per month.

Today's existing HPC systems are a viable platform to host MTC applications. However, some challenges arise in large scale applications when run on large scale systems, which can hamper the efficiency and utilization of these large scale systems. These challenges include local resource manager scalability and granularity, efficient utilization of the raw hardware, shared file system contention and scalability, reliability at scale, application scalability, and understanding the limitations of the HPC systems in order to identify good candidate MTC applications. Furthermore, the MTC paradigm can be naturally applied to the emerging Cloud Computing paradigm due to its loosely coupled nature, which is being adopted by industry as the next wave of technological advancement to reduce operational costs while improving efficiencies in large scale infrastructures.

For an interesting discussion by Ian Foster on the difference between MTC and HTC, please see his blog at http://ianfoster.typepad.com/blog/2008/07/many-tasks-comp.html. We also published two papers that are highly relevant to this workshop. One paper is titled "Toward Loosely Coupled Programming on Petascale Systems", and was published in SC08; the second paper is titled "Many-Task Computing for Grids and Supercomputers", which was published in MTAGS08. Furthermore, to see last year's workshop program agenda, and accepted papers and presentations, please see http://dsl.cs.uchicago.edu/MTAGS08/. For more information, please visit http://dsl.cs.uchicago.edu/MTAGS09/.

Topics
---------------------------------------------------------------------------------------
MTAGS 2009 topics of interest include, but are not limited to:

  * Compute Resource Management in large scale clusters, large Grids, Supercomputers, or Cloud Computing infrastructure
      o Scheduling
      o Job execution frameworks
      o Local resource manager extensions
      o Performance evaluation of resource managers in use on large scale systems
      o Challenges and opportunities in running many-task workloads on HPC systems
      o Challenges and opportunities in running many-task workloads on Cloud Computing infrastructure
  * Data Management in large scale Grid and Supercomputer environments:
      o Data-Aware Scheduling
      o Parallel File System performance and scalability in large deployments
      o Distributed file systems
      o Data caching frameworks and techniques
  * Large-Scale Workflow Systems
      o Workflow system performance and scalability analysis
      o Scalability of workflow systems
      o Workflow infrastructure and e-Science middleware
      o Programming Paradigms and Models
  * Large-Scale Many-Task Applications
      o Large-scale many-task applications
      o Large-scale many-task data-intensive applications
      o Large-scale high throughput computing (HTC) applications
      o Quasi-supercomputing applications, deployments, and experiences

Paper Submission and Publication
---------------------------------------------------------------------------------------
Authors are invited to submit papers with unpublished, original work of not more than 10 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages, as per ACM 8.5 x 11 manuscript guidelines (http://www.acm.org/publications/instructions_for_proceedings_volumes); document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates.
A 250 word abstract (PDF format) must be submitted online at https://cmt.research.microsoft.com/MTAGS2009/ before the deadline of August 1st, 2009 at 11:59PM PST; the final 10 page papers in PDF format will be due on September 1st, 2009 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library. Notifications of the paper decisions will be sent out by October 1st, 2009. Authors of selected excellent work will be invited to submit extended versions of their workshop papers to the IEEE Transactions on Parallel and Distributed Systems (TPDS) Journal, Special Issue on Many-Task Computing (due December 21st, 2009); for more information about this journal special issue, please visit http://dsl.cs.uchicago.edu/TPDS_MTC/. Submission implies the willingness of at least one of the authors to register and present the paper. For more information, please visit http://dsl.cs.uchicago.edu/MTAGS09/.

Important Dates
---------------------------------------------------------------------------------------
  * Abstract Due: August 1st, 2009
  * Papers Due: September 1st, 2009
  * Notification of Acceptance: October 1st, 2009
  * Camera Ready Papers Due: November 1st, 2009
  * Workshop Date: November 16th, 2009

Committee Members
---------------------------------------------------------------------------------------
Workshop Chairs
  * Ioan Raicu, University of Chicago
  * Ian Foster, University of Chicago & Argonne National Laboratory
  * Yong Zhao, Microsoft

Technical Committee (confirmed)
  * David Abramson, Monash University, Australia
  * Pete Beckman, Argonne National Laboratory, USA
  * Peter Dinda, Northwestern University, USA
  * Ian Foster, University of Chicago & Argonne National Laboratory, USA
  * Bob Grossman, University of Illinois at Chicago, USA
  * Indranil Gupta, University of Illinois at Urbana Champaign, USA
  * Alexandru Iosup, Delft University of Technology, Netherlands
  * Kamil Iskra, Argonne National Laboratory, USA
  * Chuang Liu, Ask.com, USA
  * Zhou Lei, Shanghai University, China
  * Shiyong Lu, Wayne State University, USA
  * Reagan Moore, University of North Carolina at Chapel Hill, USA
  * Marlon Pierce, Indiana University, USA
  * Ioan Raicu, University of Chicago, USA
  * Matei Ripeanu, University of British Columbia, Canada
  * David Swanson, University of Nebraska, USA
  * Greg Thain, University of Wisconsin, USA
  * Matthew Woitaszek, The University Corporation for Atmospheric Research, USA
  * Sherali Zeadally, University of the District of Columbia, USA
  * Yong Zhao, Microsoft, USA

From iraicu at cs.uchicago.edu Mon Jun 22 14:42:42 2009
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Mon, 22 Jun 2009 14:42:42 -0500
Subject: [Swift-user] CFP: Special Issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS) Journal
Message-ID: <4A3FDEB2.1090109@cs.uchicago.edu>

Call for Papers
---------------------------------------------------------------------------------------
IEEE Transactions on Parallel and Distributed Systems
Special Issue on Many-Task Computing on Grids and Supercomputers
http://dsl.cs.uchicago.edu/TPDS_MTC/
=======================================================================================
The Special Issue on Many-Task Computing (MTC) will provide the scientific community a dedicated forum, within the prestigious IEEE Transactions on Parallel and Distributed Systems Journal, for presenting new research, development, and deployment efforts of loosely coupled large scale applications on large scale clusters, Grids, Supercomputers, and Cloud Computing infrastructure. MTC, the focus of the special issue, encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal.
This special issue will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of the raw hardware, parallel file system contention and scalability, data management, I/O management, reliability at scale, and application scalability. We welcome paper submissions on all topics related to MTC on large scale systems. For more information on this special issue, please see http://dsl.cs.uchicago.edu/TPDS_MTC/.

Scope
---------------------------------------------------------------------------------------
This special issue will focus on the ability to manage and execute large scale applications on today's largest clusters, Grids, and Supercomputers. Clusters with tens of thousands of processor cores, Grids (e.g. TeraGrid) with a dozen sites and 100K+ processors, and supercomputers with up to 200K processors (e.g. IBM BlueGene/L and BlueGene/P, Cray XT5, Sun Constellation) are all now available to the broader scientific community for open science research. Large clusters and supercomputers have traditionally been high performance computing (HPC) systems, as they are efficient at executing tightly coupled parallel jobs within a particular machine with low-latency interconnects; the applications typically use message passing interface (MPI) to achieve the needed inter-process communication. On the other hand, Grids have been the preferred platform for more loosely coupled applications that tend to be managed and executed through workflow systems, commonly known to fit in the high-throughput computing (HTC) paradigm. Many-task computing (MTC) aims to bridge the gap between two computing paradigms, HTC and HPC. MTC is reminiscent of HTC, but it differs in its emphasis on using many computing resources over short periods of time to accomplish many computational tasks (i.e.
including both dependent and independent tasks), where the primary metrics are measured in seconds (e.g. FLOPS, tasks/s, MB/s I/O rates), as opposed to operations (e.g. jobs) per month. MTC denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. MTC includes loosely coupled applications that are generally communication-intensive but not naturally expressed using standard message passing interface commonly found in HPC, drawing attention to the many computations that are heterogeneous but not "happily" parallel. There is more to HPC than tightly coupled MPI, and more to HTC than embarrassingly parallel long running jobs. Like HPC applications, and science itself, applications are becoming increasingly complex opening new doors for many opportunities to apply HPC in new ways if we broaden our perspective. Some applications have just so many simple tasks that managing them is hard. Applications that operate on or produce large amounts of data need sophisticated data management in order to scale. There exist applications that involve many tasks, each composed of tightly coupled MPI tasks. Loosely coupled applications often have dependencies among tasks, and typically use files for inter-process communication. Efficient support for these sorts of applications on existing large scale systems will involve substantial technical challenges and will have big impact on science. Today's existing HPC systems are a viable platform to host MTC applications. 
However, some challenges arise when large scale applications are run on large scale systems, and these can hamper the efficiency and utilization of those systems. The challenges range from local resource manager scalability and granularity, efficient utilization of the raw hardware, parallel file system contention and scalability, data management, I/O management, reliability at scale, and application scalability, to understanding the limitations of the HPC systems in order to identify good candidate MTC applications. Furthermore, because of its loosely coupled nature, the MTC paradigm can be naturally applied to the emerging Cloud Computing paradigm, which is being adopted by industry as the next wave of technological advancement to reduce operational costs while improving efficiencies in large scale infrastructures.

For an interesting discussion by Ian Foster on the difference between MTC and HTC, please see his blog at http://ianfoster.typepad.com/blog/2008/07/many-tasks-comp.html. The proposed editors have also published several papers highly relevant to this special issue. One paper is titled "Toward Loosely Coupled Programming on Petascale Systems", and was published in the IEEE/ACM Supercomputing 2008 (SC08) Conference; the second paper is titled "Many-Task Computing for Grids and Supercomputers", and was published in the IEEE Workshop on Many-Task Computing on Grids and Supercomputers 2008 (MTAGS08). To see last year's workshop program agenda, accepted papers, and presentations, please see http://dsl.cs.uchicago.edu/MTAGS08/. To see this year's workshop web site, see http://dsl.cs.uchicago.edu/MTAGS09/.
Topics --------------------------------------------------------------------------------------- Topics of interest include, but are not limited to: * Compute Resource Management in large scale clusters, large Grids, Supercomputers, or Cloud Computing infrastructure o Scheduling o Job execution frameworks o Local resource manager extensions o Performance evaluation of resource managers in use on large scale systems o Challenges and opportunities in running many-task workloads on HPC systems o Challenges and opportunities in running many-task workloads on Cloud Computing infrastructure * Data Management in large scale Grid and Supercomputer environments: o Data-Aware Scheduling o Parallel File System performance and scalability in large deployments o Distributed file systems o Data caching frameworks and techniques * Large-Scale Workflow Systems o Workflow system performance and scalability analysis o Scalability of workflow systems o Workflow infrastructure and e-Science middleware o Programming Paradigms and Models * Large-Scale Many-Task Applications o Large-scale many-task applications o Large-scale many-task data-intensive applications o Large-scale high throughput computing (HTC) applications o Quasi-supercomputing applications, deployments, and experiences Paper Submission and Publication --------------------------------------------------------------------------------------- Authors are invited to submit papers with unpublished, original work of not more than 14 pages of double column text using single spaced 9.5 point size on 8.5 x 11 inch pages and 0.5 inch margins (http://www2.computer.org/portal/c/document_library/get_file?uuid=02e1509b-5526-4658-afb2-fe8b35044552&groupId=525767). Papers will be peer-reviewed, and accepted papers will be published in the IEEE digital library. For more information, please visithttp://dsl.cs.uchicago.edu/TPDS_MTC/. 
Important Dates --------------------------------------------------------------------------------------- * Abstract Due: December 1st, 2009 * Papers Due: December 21st, 2009 * First Round Decisions: February 22nd, 2010 * Major Revisions if needed: April 19th, 2010 * Second Round Decisions: May 24th, 2010 * Minor Revisions if needed: June 7th, 2010 * Final Decision: June 21st, 2010 * Publication Date: November, 2010 Guest Editors and Potential Reviewers --------------------------------------------------------------------------------------- Special Issue Guest Editors * Ian Foster, University of Chicago& Argonne National Laboratory * Ioan Raicu, University of Chicago * Yong Zhao, Microsoft From amoore2 at uchicago.edu Mon Jun 22 14:50:05 2009 From: amoore2 at uchicago.edu (Alex Moore) Date: Mon, 22 Jun 2009 14:50:05 -0500 Subject: [Swift-user] Running on Teraport In-Reply-To: References: <8a98db410906221048oe7e51f8lfd85c27718d7af8d@mail.gmail.com> <4A3FC552.2090406@uchicago.edu> <8a98db410906221110h434604ccqe50ebe854df7f8ca@mail.gmail.com> <8a98db410906221214u3335cd23p345e3a01d822a840@mail.gmail.com> Message-ID: <8a98db410906221250k21ef807ak126430d22af5503f@mail.gmail.com> Alright, that was it. Thanks, everything is working fine now. -Alex On Mon, Jun 22, 2009 at 2:32 PM, Glen Hocky wrote: > Also, i think you want the output of "which qsub to be the following" > > [hockyg at tp-login2 ~]$ which qsub > /soft/torque-2.3.6-r1/bin/qsub > > > On Mon, Jun 22, 2009 at 2:31 PM, Glen Hocky wrote: >> >> You have to change your environment until typing "qstat" on the command >> line works. >> I'm guessing if you edit $HOME/.soft and put in the line >> >> +torque >> >> and then type >> >> resoft;source ~/.bashrc >> >> you'll be good to go >> >> On Mon, Jun 22, 2009 at 2:14 PM, Alex Moore wrote: >>> >>> I'm running swift on tp-login.ci.uchicago.edu, and I want to run it on >>> the cluster. 
I changed my swiftwork dir as Glen suggested so I now >>> have: >>> >>> pool handle="localhost" > >>> ? >>> ? >>> ? /home/amoore2/Work >>> ? 0 >>> ? >>> >>> localhost ? wormanalysis ? ?/home/amoore2/work/run_wormanalysis.sh >>> INSTALLED ? ?INTEL32::LINUX null >>> >>> as my sites.xml and tc.dat files. I get the same error message: >>> >>> Caused by: >>> Cannot submit job: java.io.IOException: qsub: not found >>> >>> It also says that it "Failed to transfer wrapper log from >>> wormanalysis-......../info/9 on localhost" when I run the program >>> >>> -Alex >>> >>> On Mon, Jun 22, 2009 at 1:26 PM, Glen Hocky wrote: >>> > >>> > ? fast >>> > ? 01:00:00 >>> > ? >>> > ? >>> > ? /home/hockyg/swiftwork >>> > >>> > >>> > change swiftwork dir >>> > >>> > On Mon, Jun 22, 2009 at 1:10 PM, Alex Moore >>> > wrote: >>> >> >>> >> Thanks, that got it running. I get a different error message now >>> >> though: >>> >> >>> >> Execution failed: >>> >> Exception in wormanalysis >>> >> Arguments: [home/amoore2/Work/Data/070326.tif, >>> >> home/amoore2/Work/Save/Ang-Def.0006.dat, >>> >> home/amoore2/Work/Save/Props.0006.dat] >>> >> Host: localhost >>> >> Directory: >>> >> wormanalysis-20090622-1303-7z3h1d62/jobs/0/wormanalysis-0mumlmcj >>> >> stderr.txt >>> >> >>> >> stdout.txt >>> >> >>> >> ---- >>> >> Caused by: >>> >> Cannot submit job: java.io.IOException: qsub: not found >>> >> >>> >> The program loads an image file from my CI account as input and >>> >> outputs two .dat files to a directory on my CI account as well- I >>> >> don't know if it might be something in my code that is causing this. >>> >> Thanks. >>> >> -Alex >>> >> >>> >> On Mon, Jun 22, 2009 at 12:54 PM, Zhao Zhang >>> >> wrote: >>> >> > Hi, Alex >>> >> > >>> >> > # sitename ? transformation ? path ? INSTALLED ? ?platform >>> >> > ?profiles >>> >> > pbs ? ? wormanalysis ? ?/home/amoore2/work/run_wormanalysis.sh >>> >> > INSTALLED ? ?INTEL32::LINUX null >>> >> > >>> >> > change the "pbs" above to localhost. 
>>> >> > That "sitename" field should be the same as the in >>> >> > sites.xml >>> >> > >>> >> > best >>> >> > zhao >>> >> > >>> >> > >>> >> > Alex Moore wrote: >>> >> >> >>> >> >> Trying to run a job on Teraport. Logged into >>> >> >> tp-login.ci.uchicago.edu. >>> >> >> Use the following entries for sites.xml and tc.dat: >>> >> >> ------ >>> >> >> ? >>> >> >> ? ? >>> >> >> ? ? >>> >> >> ? ?/var/tmp >>> >> >> ? ?0 >>> >> >> ? >>> >> >> ---------- >>> >> >> # sitename ? transformation ? path ? INSTALLED ? ?platform >>> >> >> ?profiles >>> >> >> pbs ? ? wormanalysis ? ?/home/amoore2/work/run_wormanalysis.sh >>> >> >> INSTALLED ? ?INTEL32::LINUX null >>> >> >> ---------- >>> >> >> >>> >> >> My swift program call an app named wormanalysis that is on my ci >>> >> >> account. Runs fine locally. Mike Wilde said that changing the >>> >> >> ">> >> >> identity=urn:cog-1245692325277)" with constraints {filenames= >>> >> >> [Ljava.lang.Sting;@15f98bd, trfqn=wormanalysis, >>> >> >> >>> >> >> filecache=0rg.griphyn.vdl.karajan.lib.cache.CacheMapAdapter at 15f98f9, >>> >> >> tr=wormanalysis} >>> >> >> >>> >> >> Any help would be appreciated. Thanks. >>> >> >> -Alex >>> >> >> _______________________________________________ >>> >> >> Swift-user mailing list >>> >> >> Swift-user at ci.uchicago.edu >>> >> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >>> >> >> >>> >> >> >>> >> > >>> >> _______________________________________________ >>> >> Swift-user mailing list >>> >> Swift-user at ci.uchicago.edu >>> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >>> > >>> > >> > > From me.melly at gmail.com Mon Jun 22 15:28:36 2009 From: me.melly at gmail.com (Melinda Chin) Date: Mon, 22 Jun 2009 15:28:36 -0500 Subject: [Swift-user] Swift-Plot-Log Message-ID: <63cc32bc0906221328q35426bb1p350be98bb8a212c@mail.gmail.com> I was testing out the swift-plot-log command however I was running into some troubles. 
I was using http://www.ci.uchicago.edu/swift/guides/log-processing.php as my guide to run swift-plot-log on a random log I picked from the /disk/ci-gpfs/swift/swift-logs directory. My swift-plot-log is in /home/mchin/swift/bin. Inside the swift dir, I created a log-processing dir with a bin dir inside it, and typed the commands as shown in Step 3 of the above URL.

======================================================================
[mchin at tp-login2 bin]$ cd ../log-processing/bin/
[mchin at tp-login2 bin]$ export PATH=$(pwd):$PATH
[mchin at tp-login2 bin]$ swift-plot-log /path/to/mchin-20090609-1307-020920gc.log
Log file path is /path/to/mchin-20090609-1307-020920gc.log
Log is in directory /path/to
Log basename is mchin-20090609-1307-020920gc
Now in directory /tmp/swift-plot-log-pnKiMSBLiTdy9503
rm -f start-times.data kickstart-times.data start-time.tmp end-time.tmp threads.list tasks.list log *.data *.shifted *.png *.event *.coloured-event *.total *.tmp *.transitions *.last karatasks-type-counts.txt index.html *.lastsummary execstages.plot total.plot colour.plot jobs-sites.html jobs.retrycount.summary kickstart.stats execution-counts.txt site-duration.txt jobs.retrycount sp.plot karatasks.coloured-sorted-event *.cedps *.stats t.inf *.seenstates tmp-* clusterstats trname-summary sites-list.data.nm info-md5sums pse2d-tmp.eip karajan.html falkon.html execute2.html info.html execute.html kickstart.html scheduler.html assorted.html
make: *** No rule to make target `/path/to/mchin-20090609-1307-020920gc.log', needed by `karatasks.transitions'. Stop.
[mchin at tp-login2 bin]$
======================================================================

pwd: /home/mchin/swift/log-processing/bin

The directory I am in is empty at the moment, but I have also tried it with both swift-plot-log and the log file in the current directory at the same time, and the output is the same regardless. I tried reading what swift-plot-log does using emacs, but got a little lost.
What I gathered is that something is wrong with karatasks, but I don't know what I'm doing wrong or how to fix this problem. Thank you ahead of time. Sincerely, Melinda Chin

From benc at hawaga.org.uk Mon Jun 22 16:24:51 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Mon, 22 Jun 2009 21:24:51 +0000 (GMT) Subject: [Swift-user] Swift-Plot-Log In-Reply-To: <63cc32bc0906221328q35426bb1p350be98bb8a212c@mail.gmail.com> References: <63cc32bc0906221328q35426bb1p350be98bb8a212c@mail.gmail.com> Message-ID:

On Mon, 22 Jun 2009, Melinda Chin wrote:
> I was testing out the swift-plot-log command however I was running into some
> troubles. I was using
> http://www.ci.uchicago.edu/swift/guides/log-processing.php as my guide to

That's out of date. Do this: put the log file you want to process in a directory that you have write access to. If you did the run yourself, then it will already be in such a directory. If you are getting the log from the repository, copy it to a directory of your choice. Inside that directory, type:

swift-plot-log whatever.log

where whatever.log is the name of the log file.

--

From benc at hawaga.org.uk Mon Jun 22 16:44:51 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Mon, 22 Jun 2009 21:44:51 +0000 (GMT) Subject: [Swift-user] Swift-Plot-Log In-Reply-To: <63cc32bc0906221328q35426bb1p350be98bb8a212c@mail.gmail.com> References: <63cc32bc0906221328q35426bb1p350be98bb8a212c@mail.gmail.com> Message-ID:

On Mon, 22 Jun 2009, Melinda Chin wrote:
> http://www.ci.uchicago.edu/swift/guides/log-processing.php as my guide to

I have modified this document so that it no longer includes incorrect instructions for executing swift-plot-log. The proper place for such instructions is now the user guide, which has a small section about running swift-plot-log.
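The procedure described in this thread (copy the log into a directory you can write to, then run swift-plot-log from inside that directory) could be sketched roughly as follows. The helper names stage_log and plot_log and the example paths are illustrative assumptions, not part of Swift; only the `swift-plot-log whatever.log` invocation comes from the thread.

```shell
#!/bin/sh
# Minimal sketch, assuming swift-plot-log is on PATH and is run from
# inside a writable directory that contains the log.

# stage_log LOGFILE WORKDIR (hypothetical helper):
# copy LOGFILE into WORKDIR and print the path of the copy.
stage_log() {
    log=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    cp "$log" "$workdir/" || return 1
    printf '%s\n' "$workdir/$(basename "$log")"
}

# plot_log LOGFILE WORKDIR (hypothetical helper):
# stage the log, then run swift-plot-log on it from inside WORKDIR.
plot_log() {
    staged=$(stage_log "$1" "$2") || return 1
    if command -v swift-plot-log >/dev/null 2>&1; then
        (cd "$2" && swift-plot-log "$(basename "$staged")")
    else
        echo "swift-plot-log not found; is the Swift bin directory on PATH?" >&2
        return 127
    fi
}

# Example call (both paths are placeholders):
#   plot_log /path/to/some-run.log "$HOME/plots/some-run"
```

Keeping the copy step separate from the plotting step makes it easy to check that the log really landed in a writable directory before any plotting is attempted.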
--

From wilde at mcs.anl.gov Fri Jun 26 13:53:39 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 26 Jun 2009 13:53:39 -0500 Subject: [Swift-user] How best to get per-job times? Message-ID: <4A451933.9050704@mcs.anl.gov>

I was under the impression that wrapper logs have superseded kickstart logs. But looking at the _swiftwrap script, I don't see where it records the job's resource consumption (at least CPU and wall time, and ideally more, like memory). Is that available somewhere from the wrapper, or only from kickstart?

From me.melly at gmail.com Fri Jun 26 14:04:49 2009 From: me.melly at gmail.com (Melinda Chin) Date: Fri, 26 Jun 2009 14:04:49 -0500 Subject: [swift-user] List of OSG sites Message-ID: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com>

Here's what I think is a simple question but can't find the answer to? Where can I find a list of all the osg sites? Thank you, Melinda Chin

From wilde at mcs.anl.gov Fri Jun 26 14:14:36 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 26 Jun 2009 14:14:36 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> Message-ID: <4A451E1C.4020406@mcs.anl.gov>

From a practical point of view, you can get the sites in the OSG "engagement" VO using this Swift command, which returns the list as a sites.xml file:

swift-osg-ress-site-catalog --engage-verified

That command also has options for getting sites from other VOs:

swift-osg-ress-site-catalog --vo=osg

On 6/26/09 2:04 PM, Melinda Chin wrote:
> Here's what I think is a simple question but can't find the answer to?
> Where can I find a list of all the osg sites?
> > Thank you, > Melinda Chin > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From HodgessE at uhd.edu Fri Jun 26 15:20:25 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Fri, 26 Jun 2009 15:20:25 -0500 Subject: [swift-user] List of OSG sites References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> I'm having trouble with the swift-osg-ress-site-catalog command: [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified Neither the environment variable CONDOR_CONFIG, /etc/condor/, nor ~condor/ contain a condor_config source. Either set CONDOR_CONFIG to point to a valid config source, or put a "condor_config" file in /etc/condor or ~condor/ Exiting. [erin at tp-login2 swift]$ Any help is much appreciated. Thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde Sent: Fri 6/26/2009 2:14 PM To: Melinda Chin Cc: Swift User Discussion List Subject: Re: [swift-user] List of OSG sites From a practical point of view, you can get the sites in the OSG "engagement" VO using this Swift command, which returns the list as a sites.xml file: swift-osg-ress-site-catalog --engage-verified That command also has options for getting sites from other VOs swift-osg-ress-site-catalog --vo=osg On 6/26/09 2:04 PM, Melinda Chin wrote: > Here's what I think is a simple question but can't find the answer to? > Where can I find a list of all the osg sites? 
> > Thank you, > Melinda Chin > > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user _______________________________________________ Swift-user mailing list Swift-user at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From hockyg at uchicago.edu Fri Jun 26 15:24:46 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Fri, 26 Jun 2009 15:24:46 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> Message-ID: It's likely this error can be fixed by the same thing Zhao did to get condor working last week (since the command works for me). Try putting @osg and +osg-client into your ~/.soft file and then resetting up by either logging back in or typing "resoft;source ~/.bashrc" for reference, here is my ~/.soft file @python-2.5 +java-sun +osg-client +maui +torque +R +matlab-7.7 +apache-ant +gx-map @osg @default @globus-4 On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > > > [erin at tp-login2 swift]$ > > Any help is much appreciated. > > Thanks, > Erin > > > Erin M. 
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > > -----Original Message----- > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > Sent: Fri 6/26/2009 2:14 PM > To: Melinda Chin > Cc: Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > From a practical point of view, you can get the sites in the OSG > "engagement" VO using this Swift command, which returns the list as a > sites.xml file: > > swift-osg-ress-site-catalog --engage-verified > > That command also has options for getting sites from other VOs > > swift-osg-ress-site-catalog --vo=osg > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > Here's what I think is a simple question but can't find the answer to? > > Where can I find a list of all the osg sites? > > > > Thank you, > > Melinda Chin > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
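The ~/.soft change suggested above can be applied by hand in an editor. As a rough sketch only, the same edit could be scripted so that each key is added at most once. The helper name ensure_soft_key is an assumption made for illustration; the keys +osg-client and @osg and the follow-up "resoft;source ~/.bashrc" come from the message above. Note that this sketch appends keys at the end of the file, and the order of entries in ~/.soft may matter on a given system.

```shell
#!/bin/sh
# Minimal sketch, assuming ~/.soft is a plain text file with one
# SoftEnv key per line.

# ensure_soft_key FILE KEY (hypothetical helper):
# append KEY to FILE unless an identical line is already present.
ensure_soft_key() {
    file=$1
    key=$2
    touch "$file" || return 1
    # -q quiet, -x match the whole line, -F treat KEY as a fixed string
    grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
}

# Example calls, followed by the reload step quoted in the thread:
#   ensure_soft_key "$HOME/.soft" "+osg-client"
#   ensure_soft_key "$HOME/.soft" "@osg"
#   resoft; source ~/.bashrc
```

Checking for an exact whole-line match avoids accumulating duplicate keys when the helper is run more than once.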
URL: From HodgessE at uhd.edu Fri Jun 26 15:33:57 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Fri, 26 Jun 2009 15:33:57 -0500 Subject: [swift-user] List of OSG sites References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com><4A451E1C.4020406@mcs.anl.gov><70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> This is getting weird: [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified -bash: swift-osg-ress-site-catalog: command not found [erin at tp-login2 ~]$ Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: hockyg at gmail.com on behalf of Glen Hocky Sent: Fri 6/26/2009 3:24 PM To: Hodgess, Erin Cc: Michael Wilde; Melinda Chin; Swift User Discussion List Subject: Re: [swift-user] List of OSG sites It's likely this error can be fixed by the same thing Zhao did to get condor working last week (since the command works for me). Try putting @osg and +osg-client into your ~/.soft file and then resetting up by either logging back in or typing "resoft;source ~/.bashrc" for reference, here is my ~/.soft file @python-2.5 +java-sun +osg-client +maui +torque +R +matlab-7.7 +apache-ant +gx-map @osg @default @globus-4 On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > > > [erin at tp-login2 swift]$ > > Any help is much appreciated. > > Thanks, > Erin > > > Erin M. 
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > > -----Original Message----- > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > Sent: Fri 6/26/2009 2:14 PM > To: Melinda Chin > Cc: Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > From a practical point of view, you can get the sites in the OSG > "engagement" VO using this Swift command, which returns the list as a > sites.xml file: > > swift-osg-ress-site-catalog --engage-verified > > That command also has options for getting sites from other VOs > > swift-osg-ress-site-catalog --vo=osg > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > Here's what I think is a simple question but can't find the answer to? > > Where can I find a list of all the osg sites? > > > > Thank you, > > Melinda Chin > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
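A "command not found" from bash for swift-osg-ress-site-catalog, as in the message above, generally means the directory holding the Swift commands is not on PATH in the current shell. A small sketch of one way to put it back without adding duplicate entries is shown here. The helper name add_to_path and the example directory are assumptions for illustration; substitute the bin directory of your own Swift installation.

```shell
#!/bin/sh
# Minimal sketch: prepend a directory to PATH only if it is missing.

# add_to_path DIR (hypothetical helper)
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                        # already present, do nothing
        *) PATH="$1:$PATH"; export PATH ;;  # otherwise prepend it
    esac
}

# Example (the directory is a placeholder for your Swift install):
#   add_to_path "$HOME/swift-0.9/bin"
#   command -v swift-osg-ress-site-catalog   # prints the path if found
```

Placing a call like this in ~/.bashrc, after any SoftEnv setup, is one way to make the setting persist across logins; whether that fits a particular account's configuration is a judgment call.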
URL: From hockyg at uchicago.edu Fri Jun 26 15:39:53 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Fri, 26 Jun 2009 15:39:53 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> Message-ID: did you remember to source bashrc or similar to resetup your path? resoft destroys your custom path information. you need to have the swift bin directory in your path to have this command work... On Fri, Jun 26, 2009 at 3:33 PM, Hodgess, Erin wrote: > This is getting weird: > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > -bash: swift-osg-ress-site-catalog: command not found > [erin at tp-login2 ~]$ > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > -----Original Message----- > From: hockyg at gmail.com on behalf of Glen Hocky > Sent: Fri 6/26/2009 3:24 PM > To: Hodgess, Erin > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > It's likely this error can be fixed by the same thing Zhao did to get > condor > working last week (since the command works for me). 
Try putting @osg and > +osg-client into your ~/.soft file and then resetting up by either logging > back in or typing "resoft;source ~/.bashrc" > > for reference, here is my ~/.soft file > > @python-2.5 > +java-sun > +osg-client > +maui > +torque > +R > +matlab-7.7 > +apache-ant > +gx-map > @osg > @default > @globus-4 > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config source. > > Either set CONDOR_CONFIG to point to a valid config source, > > or put a "condor_config" file in /etc/condor or ~condor/ > > Exiting. > > > > > > > > > > [erin at tp-login2 swift]$ > > > > Any help is much appreciated. > > > > Thanks, > > Erin > > > > > > Erin M. Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodgesse at uhd.edu > > > > > > > > > > -----Original Message----- > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > Sent: Fri 6/26/2009 2:14 PM > > To: Melinda Chin > > Cc: Swift User Discussion List > > Subject: Re: [swift-user] List of OSG sites > > > > From a practical point of view, you can get the sites in the OSG > > "engagement" VO using this Swift command, which returns the list as a > > sites.xml file: > > > > swift-osg-ress-site-catalog --engage-verified > > > > That command also has options for getting sites from other VOs > > > > swift-osg-ress-site-catalog --vo=osg > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > Here's what I think is a simple question but can't find the answer to? > > > Where can I find a list of all the osg sites? 
> > > > > > Thank you, > > > Melinda Chin > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Fri Jun 26 15:41:05 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 26 Jun 2009 15:41:05 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com><4A451E1C.4020406@mcs.anl.gov><70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> Message-ID: <4A453261.9090300@mcs.anl.gov> If you added your swift/bin directory to your path manually, when you run resoft you may loose that and need to do it again. On 6/26/09 3:33 PM, Hodgess, Erin wrote: > This is getting weird: > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > -bash: swift-osg-ress-site-catalog: command not found > [erin at tp-login2 ~]$ > > Erin M. 
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > -----Original Message----- > From: hockyg at gmail.com on behalf of Glen Hocky > Sent: Fri 6/26/2009 3:24 PM > To: Hodgess, Erin > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > It's likely this error can be fixed by the same thing Zhao did to get condor > working last week (since the command works for me). Try putting @osg and > +osg-client into your ~/.soft file and then resetting up by either logging > back in or typing "resoft;source ~/.bashrc" > > for reference, here is my ~/.soft file > > @python-2.5 > +java-sun > +osg-client > +maui > +torque > +R > +matlab-7.7 > +apache-ant > +gx-map > @osg > @default > @globus-4 > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config source. > > Either set CONDOR_CONFIG to point to a valid config source, > > or put a "condor_config" file in /etc/condor or ~condor/ > > Exiting. > > > > > > > > > > [erin at tp-login2 swift]$ > > > > Any help is much appreciated. > > > > Thanks, > > Erin > > > > > > Erin M. 
Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodgesse at uhd.edu > > > > > > > > > > -----Original Message----- > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > Sent: Fri 6/26/2009 2:14 PM > > To: Melinda Chin > > Cc: Swift User Discussion List > > Subject: Re: [swift-user] List of OSG sites > > > > From a practical point of view, you can get the sites in the OSG > > "engagement" VO using this Swift command, which returns the list as a > > sites.xml file: > > > > swift-osg-ress-site-catalog --engage-verified > > > > That command also has options for getting sites from other VOs > > > > swift-osg-ress-site-catalog --vo=osg > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > Here's what I think is a simple question but can't find the answer to? > > > Where can I find a list of all the osg sites? > > > > > > Thank you, > > > Melinda Chin > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > From HodgessE at uhd.edu Fri Jun 26 15:43:48 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Fri, 26 Jun 2009 15:43:48 -0500 Subject: [swift-user] List of OSG sites References: 
<63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com><4A451E1C.4020406@mcs.anl.gov><70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus><70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D2@BALI.uhd.campus> Hi again Here's what's going on: [erin at tp-login2 bin]$ pwd /home/erin/swift-0.9/bin [erin at tp-login2 bin]$ cat ~/.soft # # This is your SoftEnv configuration run control file. # # It is used to tell SoftEnv how to customize your environment by # setting up variables such as PATH and MANPATH. To learn more # about this file, do a "man softenv". # @python-2.5 +java-sun +apache-ant +gx-map +condor +gx-map @globus-4 @default +R +torque +maui +matlab-7.7 +osg-client #+osg-client-1.0.0-r1 @osg +apache-ant +gx-map [erin at tp-login2 bin]$ source ~/.bashrc [erin at tp-login2 bin]$ swift-osg-ress-site-catalog --engage-verified -bash: swift-osg-ress-site-catalog: command not found [erin at tp-login2 bin]$ Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: hockyg at gmail.com on behalf of Glen Hocky Sent: Fri 6/26/2009 3:39 PM To: Hodgess, Erin Cc: Michael Wilde; Melinda Chin; Swift User Discussion List Subject: Re: [swift-user] List of OSG sites did you remember to source bashrc or similar to resetup your path? resoft destroys your custom path information. you need to have the swift bin directory in your path to have this command work... On Fri, Jun 26, 2009 at 3:33 PM, Hodgess, Erin wrote: > This is getting weird: > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > -bash: swift-osg-ress-site-catalog: command not found > [erin at tp-login2 ~]$ > > Erin M. 
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > -----Original Message----- > From: hockyg at gmail.com on behalf of Glen Hocky > Sent: Fri 6/26/2009 3:24 PM > To: Hodgess, Erin > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > It's likely this error can be fixed by the same thing Zhao did to get > condor > working last week (since the command works for me). Try putting @osg and > +osg-client into your ~/.soft file and then resetting up by either logging > back in or typing "resoft;source ~/.bashrc" > > for reference, here is my ~/.soft file > > @python-2.5 > +java-sun > +osg-client > +maui > +torque > +R > +matlab-7.7 > +apache-ant > +gx-map > @osg > @default > @globus-4 > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config source. > > Either set CONDOR_CONFIG to point to a valid config source, > > or put a "condor_config" file in /etc/condor or ~condor/ > > Exiting. > > > > > > > > > > [erin at tp-login2 swift]$ > > > > Any help is much appreciated. > > > > Thanks, > > Erin > > > > > > Erin M. 
Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodgesse at uhd.edu > > > > > > > > > > -----Original Message----- > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > Sent: Fri 6/26/2009 2:14 PM > > To: Melinda Chin > > Cc: Swift User Discussion List > > Subject: Re: [swift-user] List of OSG sites > > > > From a practical point of view, you can get the sites in the OSG > > "engagement" VO using this Swift command, which returns the list as a > > sites.xml file: > > > > swift-osg-ress-site-catalog --engage-verified > > > > That command also has options for getting sites from other VOs > > > > swift-osg-ress-site-catalog --vo=osg > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > Here's what I think is a simple question but can't find the answer to? > > > Where can I find a list of all the osg sites? > > > > > > Thank you, > > > Melinda Chin > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
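The "command not found" result above is what the shell reports when the Swift bin directory is not on PATH. A minimal sketch of checking for the directory and adding it, assuming the install location /home/erin/swift-0.9/bin shown in the pwd output above; path_contains and SWIFT_BIN are hypothetical names introduced only for this illustration:

```shell
# Hypothetical helper: succeed if the directory in $1 is already an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed install location, taken from the pwd output in this thread.
SWIFT_BIN="${SWIFT_BIN:-/home/erin/swift-0.9/bin}"

# Append only when missing, so re-sourcing ~/.bashrc does not duplicate the entry.
if ! path_contains "$SWIFT_BIN"; then
    PATH="$PATH:$SWIFT_BIN"
    export PATH
fi

# Show which executable the shell would now run, if any.
command -v swift-osg-ress-site-catalog || echo "swift-osg-ress-site-catalog is still not on PATH"
```

With lines like these in ~/.bashrc, running "resoft; source ~/.bashrc" should put the entry back after resoft rebuilds the environment.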
URL: From hategan at mcs.anl.gov Fri Jun 26 15:59:02 2009 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 26 Jun 2009 15:59:02 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <4A453261.9090300@mcs.anl.gov> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> Message-ID: <1246049942.14074.2.camel@localhost> Doesn't work. My .soft: @default @globus-4 +java-sun +torque +maui +condor @osg +osg-client [hategan at tp-login2 ~]$ condor_status Neither the environment variable CONDOR_CONFIG, /etc/condor/, nor ~condor/ contain a condor_config source. Either set CONDOR_CONFIG to point to a valid config source, or put a "condor_config" file in /etc/condor or ~condor/ Exiting. On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > If you added your swift/bin directory to your path manually, when you > run resoft you may loose that and need to do it again. > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > This is getting weird: > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > > -bash: swift-osg-ress-site-catalog: command not found > > [erin at tp-login2 ~]$ > > > > Erin M. Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodgesse at uhd.edu > > > > > > > > -----Original Message----- > > From: hockyg at gmail.com on behalf of Glen Hocky > > Sent: Fri 6/26/2009 3:24 PM > > To: Hodgess, Erin > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > > Subject: Re: [swift-user] List of OSG sites > > > > It's likely this error can be fixed by the same thing Zhao did to get condor > > working last week (since the command works for me). 
Try putting @osg and > > +osg-client into your ~/.soft file and then resetting up by either logging > > back in or typing "resoft;source ~/.bashrc" > > > > for reference, here is my ~/.soft file > > > > @python-2.5 > > +java-sun > > +osg-client > > +maui > > +torque > > +R > > +matlab-7.7 > > +apache-ant > > +gx-map > > @osg > > @default > > @globus-4 > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > > > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > > > > > Neither the environment variable CONDOR_CONFIG, > > > /etc/condor/, nor ~condor/ contain a condor_config source. > > > Either set CONDOR_CONFIG to point to a valid config source, > > > or put a "condor_config" file in /etc/condor or ~condor/ > > > Exiting. > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > Any help is much appreciated. > > > > > > Thanks, > > > Erin > > > > > > > > > Erin M. Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > -----Original Message----- > > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > > Sent: Fri 6/26/2009 2:14 PM > > > To: Melinda Chin > > > Cc: Swift User Discussion List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > From a practical point of view, you can get the sites in the OSG > > > "engagement" VO using this Swift command, which returns the list as a > > > sites.xml file: > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > That command also has options for getting sites from other VOs > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > Here's what I think is a simple question but can't find the answer to? 
> > > > Where can I find a list of all the osg sites? > > > > > > > > Thank you, > > > > Melinda Chin > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From hockyg at uchicago.edu Fri Jun 26 16:17:10 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Fri, 26 Jun 2009 16:17:10 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <1246049942.14074.2.camel@localhost> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> Message-ID: I never claimed the "condor_status" command works. but with my setup [hockyg at communicado ~]$ swift-osg-ress-site-catalog | head /grid/data/engage/tmp/FNAL_DZEROOSG_1 On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan wrote: > Doesn't work. > > My .soft: > @default > @globus-4 > +java-sun > +torque > +maui > +condor > @osg > +osg-client > > [hategan at tp-login2 ~]$ condor_status > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. 
> Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > If you added your swift/bin directory to your path manually, when you > > run resoft you may loose that and need to do it again. > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > > > -bash: swift-osg-ress-site-catalog: command not found > > > [erin at tp-login2 ~]$ > > > > > > Erin M. Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > -----Original Message----- > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > Sent: Fri 6/26/2009 3:24 PM > > > To: Hodgess, Erin > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > It's likely this error can be fixed by the same thing Zhao did to get > condor > > > working last week (since the command works for me). Try putting @osg > and > > > +osg-client into your ~/.soft file and then resetting up by either > logging > > > back in or typing "resoft;source ~/.bashrc" > > > > > > for reference, here is my ~/.soft file > > > > > > @python-2.5 > > > +java-sun > > > +osg-client > > > +maui > > > +torque > > > +R > > > +matlab-7.7 > > > +apache-ant > > > +gx-map > > > @osg > > > @default > > > @globus-4 > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > wrote: > > > > > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > --engage-verified > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > /etc/condor/, nor ~condor/ contain a condor_config source. 
> > > > Either set CONDOR_CONFIG to point to a valid config source, > > > > or put a "condor_config" file in /etc/condor or ~condor/ > > > > Exiting. > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > Any help is much appreciated. > > > > > > > > Thanks, > > > > Erin > > > > > > > > > > > > Erin M. Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > > > Sent: Fri 6/26/2009 2:14 PM > > > > To: Melinda Chin > > > > Cc: Swift User Discussion List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > From a practical point of view, you can get the sites in the OSG > > > > "engagement" VO using this Swift command, which returns the list as > a > > > > sites.xml file: > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > That command also has options for getting sites from other VOs > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > Here's what I think is a simple question but can't find the answer > to? > > > > > Where can I find a list of all the osg sites? 
> > > > > > > > > > Thank you, > > > > > Melinda Chin > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > _______________________________________________ > > > > > Swift-user mailing list > > > > > Swift-user at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > -------------- next part -------------- An HTML attachment was scrubbed... 
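The condor_status error quoted above names three places where a condor_config source is looked for. A minimal sketch that checks just those three places, in the order the message lists them; find_condor_config is a hypothetical helper, and a real Condor installation may consult other locations that the message does not mention:

```shell
# Hypothetical helper: print the first readable condor_config among the
# three locations named in the error message, or fail if none exists.
find_condor_config() {
    # 1. The CONDOR_CONFIG environment variable, if set and readable.
    if [ -n "${CONDOR_CONFIG:-}" ] && [ -r "$CONDOR_CONFIG" ]; then
        printf '%s\n' "$CONDOR_CONFIG"
        return 0
    fi
    # 2. /etc/condor/ and 3. the home directory of the "condor" user.
    # If no such user exists, ~condor stays literal and the test simply fails.
    for candidate in /etc/condor/condor_config ~condor/condor_config; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if config=$(find_condor_config); then
    echo "condor_config found at: $config"
else
    echo "no condor_config found in the locations named by the error message"
fi
```

If nothing is found, setting CONDOR_CONFIG to the path of a valid configuration file is the remedy the error message itself suggests.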
URL: From hockyg at uchicago.edu Fri Jun 26 16:18:46 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Fri, 26 Jun 2009 16:18:46 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D2@BALI.uhd.campus> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D2@BALI.uhd.campus> Message-ID: In this setup you could run "./swift-osg-ress-site-catalog", since you are already in the swift bin directory. You should also put the line "export PATH=$PATH:/home/erin/swift-0.9/bin" in your .bashrc file before running the source command, so that the command is found from any directory. On Fri, Jun 26, 2009 at 3:43 PM, Hodgess, Erin wrote: > Hi again > > Here's what's going on: > > [erin at tp-login2 bin]$ pwd > /home/erin/swift-0.9/bin > [erin at tp-login2 bin]$ cat ~/.soft > # > # This is your SoftEnv configuration run control file. > # > # It is used to tell SoftEnv how to customize your environment by > # setting up variables such as PATH and MANPATH. To learn more > # about this file, do a "man softenv". > # > @python-2.5 > > +java-sun > +apache-ant > +gx-map > +condor > +gx-map > @globus-4 > @default > +R > +torque > +maui > +matlab-7.7 > +osg-client > #+osg-client-1.0.0-r1 > @osg > +apache-ant > +gx-map > [erin at tp-login2 bin]$ source ~/.bashrc > [erin at tp-login2 bin]$ swift-osg-ress-site-catalog --engage-verified > -bash: swift-osg-ress-site-catalog: command not found > [erin at tp-login2 bin]$ > > > Erin M.
Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > > > -----Original Message----- > From: hockyg at gmail.com on behalf of Glen Hocky > Sent: Fri 6/26/2009 3:39 PM > To: Hodgess, Erin > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > Subject: Re: [swift-user] List of OSG sites > > did you remember to source bashrc or similar to resetup your path? resoft > destroys your custom path information. you need to have the swift bin > directory in your path to have this command work... > > On Fri, Jun 26, 2009 at 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > > -bash: swift-osg-ress-site-catalog: command not found > > [erin at tp-login2 ~]$ > > > > Erin M. Hodgess, PhD > > Associate Professor > > Department of Computer and Mathematical Sciences > > University of Houston - Downtown > > mailto: hodgesse at uhd.edu > > > > > > > > -----Original Message----- > > From: hockyg at gmail.com on behalf of Glen Hocky > > Sent: Fri 6/26/2009 3:24 PM > > To: Hodgess, Erin > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > > Subject: Re: [swift-user] List of OSG sites > > > > It's likely this error can be fixed by the same thing Zhao did to get > > condor > > working last week (since the command works for me). 
Try putting @osg and > > +osg-client into your ~/.soft file and then resetting up by either > logging > > back in or typing "resoft;source ~/.bashrc" > > > > for reference, here is my ~/.soft file > > > > @python-2.5 > > +java-sun > > +osg-client > > +maui > > +torque > > +R > > +matlab-7.7 > > +apache-ant > > +gx-map > > @osg > > @default > > @globus-4 > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin wrote: > > > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog --engage-verified > > > > > > Neither the environment variable CONDOR_CONFIG, > > > /etc/condor/, nor ~condor/ contain a condor_config source. > > > Either set CONDOR_CONFIG to point to a valid config source, > > > or put a "condor_config" file in /etc/condor or ~condor/ > > > Exiting. > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > Any help is much appreciated. > > > > > > Thanks, > > > Erin > > > > > > > > > Erin M. Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > -----Original Message----- > > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > > Sent: Fri 6/26/2009 2:14 PM > > > To: Melinda Chin > > > Cc: Swift User Discussion List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > From a practical point of view, you can get the sites in the OSG > > > "engagement" VO using this Swift command, which returns the list as a > > > sites.xml file: > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > That command also has options for getting sites from other VOs > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > Here's what I think is a simple question but can't find the answer > to? 
> > > > Where can I find a list of all the osg sites? > > > > > > > > Thank you, > > > > Melinda Chin > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hockyg at uchicago.edu Fri Jun 26 16:22:02 2009 From: hockyg at uchicago.edu (Glen Hocky) Date: Fri, 26 Jun 2009 16:22:02 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: <1246049942.14074.2.camel@localhost> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> Message-ID: By the way, in reference to my previous e-mail, the result of my condor_status is: condor_status CEDAR:6001:Failed to connect to <128.135.125.117:9618> Error: Couldn't contact the condor_collector on tp-login2.ci.uchicago.edu. Extra Info: the condor_collector is a process that runs on the central manager of your Condor pool and collects the status of all the machines and jobs in the Condor pool. The condor_collector might not be running, it might be refusing to communicate with you, there might be a network problem, or there may be some other problem.
Check with your system administrator to fix this problem. If you are the system administrator, check that the condor_collector is running on tp-login2.ci.uchicago.edu, check the HOSTALLOW configuration in your condor_config, and check the MasterLog and CollectorLog files in your log directory for possible clues as to why the condor_collector is not responding. Also see the Troubleshooting section of the manual. which is different from yours. try putting @osg and +osg-client before @default On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan wrote: > Doesn't work. > > My .soft: > @default > @globus-4 > +java-sun > +torque > +maui > +condor > @osg > +osg-client > > [hategan at tp-login2 ~]$ condor_status > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > If you added your swift/bin directory to your path manually, when you > > run resoft you may loose that and need to do it again. > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog --engage-verified > > > -bash: swift-osg-ress-site-catalog: command not found > > > [erin at tp-login2 ~]$ > > > > > > Erin M. 
Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > -----Original Message----- > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > Sent: Fri 6/26/2009 3:24 PM > > > To: Hodgess, Erin > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > It's likely this error can be fixed by the same thing Zhao did to get > condor > > > working last week (since the command works for me). Try putting @osg > and > > > +osg-client into your ~/.soft file and then resetting up by either > logging > > > back in or typing "resoft;source ~/.bashrc" > > > > > > for reference, here is my ~/.soft file > > > > > > @python-2.5 > > > +java-sun > > > +osg-client > > > +maui > > > +torque > > > +R > > > +matlab-7.7 > > > +apache-ant > > > +gx-map > > > @osg > > > @default > > > @globus-4 > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > wrote: > > > > > > > I'm having trouble with the swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > --engage-verified > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > /etc/condor/, nor ~condor/ contain a condor_config source. > > > > Either set CONDOR_CONFIG to point to a valid config source, > > > > or put a "condor_config" file in /etc/condor or ~condor/ > > > > Exiting. > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > Any help is much appreciated. > > > > > > > > Thanks, > > > > Erin > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of Michael Wilde > > > > Sent: Fri 6/26/2009 2:14 PM > > > > To: Melinda Chin > > > > Cc: Swift User Discussion List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > From a practical point of view, you can get the sites in the OSG > > > > "engagement" VO using this Swift command, which returns the list as > a > > > > sites.xml file: > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > That command also has options for getting sites from other VOs > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > Here's what I think is a simple question but can't find the answer > to? > > > > > Where can I find a list of all the osg sites? 
> > > > > > > > > > Thank you, > > > > > Melinda Chin > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > _______________________________________________ > > > > > Swift-user mailing list > > > > > Swift-user at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > _______________________________________________ > > > > Swift-user mailing list > > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Fri Jun 26 16:26:59 2009 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 26 Jun 2009 16:26:59 -0500 Subject: [swift-user] List of OSG sites In-Reply-To: References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> Message-ID: <1246051619.14751.4.camel@localhost> On Fri, 2009-06-26 at 16:17 -0500, Glen Hocky wrote: > I never claimed the "condor_status" command works. but with my setup No, but I claim that it must work in order for swift-osg-ress-site-catalog to work.
Now obviously something isn't right here, since neither Erin nor I can get it to work, even when using your .soft verbatim. > > [hockyg at communicado ~]$ swift-osg-ress-site-catalog | head > > > > > > url="d0cabosg1.fnal.gov/jobmanager-pbs" major="2" /> > >/grid/data/engage/tmp/FNAL_DZEROOSG_1 > > > > > > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > wrote: > Doesn't work. > > My .soft: > @default > @globus-4 > +java-sun > +torque > +maui > +condor > @osg > +osg-client > > [hategan at tp-login2 ~]$ condor_status > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > If you added your swift/bin directory to your path manually, > when you > > run resoft you may loose that and need to do it again. > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > --engage-verified > > > -bash: swift-osg-ress-site-catalog: command not found > > > [erin at tp-login2 ~]$ > > > > > > Erin M. Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > -----Original Message----- > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > Sent: Fri 6/26/2009 3:24 PM > > > To: Hodgess, Erin > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion > List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > It's likely this error can be fixed by the same thing Zhao > did to get condor > > > working last week (since the command works for me).
Try > putting @osg and > > > +osg-client into your ~/.soft file and then resetting up > by either logging > > > back in or typing "resoft;source ~/.bashrc" > > > > > > for reference, here is my ~/.soft file > > > > > > @python-2.5 > > > +java-sun > > > +osg-client > > > +maui > > > +torque > > > +R > > > +matlab-7.7 > > > +apache-ant > > > +gx-map > > > @osg > > > @default > > > @globus-4 > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > wrote: > > > > > > > I'm having trouble with the > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > --engage-verified > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > /etc/condor/, nor ~condor/ contain a condor_config > source. > > > > Either set CONDOR_CONFIG to point to a valid config > source, > > > > or put a "condor_config" file in /etc/condor or > ~condor/ > > > > Exiting. > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > Any help is much appreciated. > > > > > > > > Thanks, > > > > Erin > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of > Michael Wilde > > > > Sent: Fri 6/26/2009 2:14 PM > > > > To: Melinda Chin > > > > Cc: Swift User Discussion List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > From a practical point of view, you can get the sites > in the OSG > > > > "engagement" VO using this Swift command, which returns > the list as a > > > > sites.xml file: > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > That command also has options for getting sites from > other VOs > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > Here's what I think is a simple question but can't > find the answer to? > > > > > Where can I find a list of all the osg sites? 
> > > > >
> > > > > Thank you,
> > > > > Melinda Chin

From hategan at mcs.anl.gov Fri Jun 26 16:34:31 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 26 Jun 2009 16:34:31 -0500
Subject: [swift-user] List of OSG sites
In-Reply-To:
References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost>
Message-ID: <1246052071.14751.8.camel@localhost>

On Fri, 2009-06-26 at 16:22 -0500, Glen Hocky wrote:
> By the way, in reference to my previous e-mail. the result of my
> condor status is:

Yep, this means it's working, but there is no condor pool on tp.

Erin, try this:
source /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh

Then swift-osg...
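
A minimal sketch of the two steps suggested above (source the OSG client setup script, then run the catalog command), written as a POSIX shell function. The function name try_osg_catalog and the sites.xml default are illustrative assumptions, not part of Swift or the OSG client. The setup script path used in the thread is specific to the TeraPort login hosts and will likely differ elsewhere.

```shell
# try_osg_catalog SETUP_SCRIPT [OUTPUT_FILE]
# Hypothetical helper (not part of Swift): source an OSG client setup
# script, then write the engagement-VO site list to OUTPUT_FILE.
try_osg_catalog() {
    setup_script=$1
    out=${2:-sites.xml}

    if [ ! -r "$setup_script" ]; then
        echo "setup script not readable: $setup_script" >&2
        return 1
    fi

    # Sourcing (rather than executing) lets the script change this
    # shell's environment, for example PATH or CONDOR_CONFIG.
    . "$setup_script"

    if ! command -v swift-osg-ress-site-catalog >/dev/null 2>&1; then
        # The thread notes that resoft can drop a manually added swift/bin.
        echo "swift-osg-ress-site-catalog is not on PATH" >&2
        return 2
    fi

    # The command prints a sites.xml document on standard output.
    swift-osg-ress-site-catalog --engage-verified > "$out"
}
```

It could be called as `try_osg_catalog /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh`, after which sites.xml should hold the generated catalog if the command succeeded.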
> > condor_status > CEDAR:6001:Failed to connect to <128.135.125.117:9618> > Error: Couldn't contact the condor_collector on > tp-login2.ci.uchicago.edu. > > Extra Info: the condor_collector is a process that runs on the > central > manager of your Condor pool and collects the status of all the > machines and > jobs in the Condor pool. The condor_collector might not be running, it > might > be refusing to communicate with you, there might be a network problem, > or > there may be some other problem. Check with your system administrator > to fix > this problem. > > If you are the system administrator, check that the condor_collector > is > running on tp-login2.ci.uchicago.edu, check the HOSTALLOW > configuration in > your condor_config, and check the MasterLog and CollectorLog files in > your > log directory for possible clues as to why the condor_collector is > not > responding. Also see the Troubleshooting section of the manual. > > which is different from yours. try putting @osg and +osg-client before > @default > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > wrote: > Doesn't work. > > My .soft: > @default > @globus-4 > +java-sun > +torque > +maui > +condor > @osg > +osg-client > > [hategan at tp-login2 ~]$ condor_status > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > If you added your swift/bin directory to your path manually, > when you > > run resoft you may loose that and need to do it again. > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > --engage-verified > > > -bash: swift-osg-ress-site-catalog: command not found > > > [erin at tp-login2 ~]$ > > > > > > Erin M. 
Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > -----Original Message----- > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > Sent: Fri 6/26/2009 3:24 PM > > > To: Hodgess, Erin > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion > List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > It's likely this error can be fixed by the same thing Zhao > did to get condor > > > working last week (since the command works for me). Try > putting @osg and > > > +osg-client into your ~/.soft file and then resetting up > by either logging > > > back in or typing "resoft;source ~/.bashrc" > > > > > > for reference, here is my ~/.soft file > > > > > > @python-2.5 > > > +java-sun > > > +osg-client > > > +maui > > > +torque > > > +R > > > +matlab-7.7 > > > +apache-ant > > > +gx-map > > > @osg > > > @default > > > @globus-4 > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > wrote: > > > > > > > I'm having trouble with the > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > --engage-verified > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > /etc/condor/, nor ~condor/ contain a condor_config > source. > > > > Either set CONDOR_CONFIG to point to a valid config > source, > > > > or put a "condor_config" file in /etc/condor or > ~condor/ > > > > Exiting. > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > Any help is much appreciated. > > > > > > > > Thanks, > > > > Erin > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of > Michael Wilde > > > > Sent: Fri 6/26/2009 2:14 PM > > > > To: Melinda Chin > > > > Cc: Swift User Discussion List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > From a practical point of view, you can get the sites > in the OSG > > > > "engagement" VO using this Swift command, which returns > the list as a > > > > sites.xml file: > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > That command also has options for getting sites from > other VOs > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > Here's what I think is a simple question but can't > find the answer to? > > > > > Where can I find a list of all the osg sites? 
> > > > >
> > > > > Thank you,
> > > > > Melinda Chin

From hockyg at uchicago.edu Fri Jun 26 16:36:20 2009
From: hockyg at uchicago.edu (Glen Hocky)
Date: Fri, 26 Jun 2009 16:36:20 -0500
Subject: [swift-user] List of OSG sites
In-Reply-To: <1246052071.14751.8.camel@localhost>
References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> <1246052071.14751.8.camel@localhost>
Message-ID:

That would be odd if it were necessary, since the only related thing I use is
source /opt/osg/setup.sh
which only works on tp-osg

On Fri, Jun 26, 2009 at 4:34 PM, Mihael Hategan wrote:
> On Fri, 2009-06-26 at 16:22 -0500, Glen Hocky wrote:
> > By the way, in reference to my previous e-mail.
the result of my > > condor status is: > > Yep, this means it's working, but there is no condor pool on tp. > > Erin, try this: > source /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh > > Then swift-osg... > > > > > condor_status > > CEDAR:6001:Failed to connect to <128.135.125.117:9618> > > Error: Couldn't contact the condor_collector on > > tp-login2.ci.uchicago.edu. > > > > Extra Info: the condor_collector is a process that runs on the > > central > > manager of your Condor pool and collects the status of all the > > machines and > > jobs in the Condor pool. The condor_collector might not be running, it > > might > > be refusing to communicate with you, there might be a network problem, > > or > > there may be some other problem. Check with your system administrator > > to fix > > this problem. > > > > If you are the system administrator, check that the condor_collector > > is > > running on tp-login2.ci.uchicago.edu, check the HOSTALLOW > > configuration in > > your condor_config, and check the MasterLog and CollectorLog files in > > your > > log directory for possible clues as to why the condor_collector is > > not > > responding. Also see the Troubleshooting section of the manual. > > > > which is different from yours. try putting @osg and +osg-client before > > @default > > > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > > wrote: > > Doesn't work. > > > > My .soft: > > @default > > @globus-4 > > +java-sun > > +torque > > +maui > > +condor > > @osg > > +osg-client > > > > [hategan at tp-login2 ~]$ condor_status > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config source. > > Either set CONDOR_CONFIG to point to a valid config source, > > or put a "condor_config" file in /etc/condor or ~condor/ > > Exiting. 
> > > > > > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > > If you added your swift/bin directory to your path manually, > > when you > > > run resoft you may loose that and need to do it again. > > > > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > > This is getting weird: > > > > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > > --engage-verified > > > > -bash: swift-osg-ress-site-catalog: command not found > > > > [erin at tp-login2 ~]$ > > > > > > > > Erin M. Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > > Sent: Fri 6/26/2009 3:24 PM > > > > To: Hodgess, Erin > > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion > > List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > It's likely this error can be fixed by the same thing Zhao > > did to get condor > > > > working last week (since the command works for me). 
Try > > putting @osg and > > > > +osg-client into your ~/.soft file and then resetting up > > by either logging > > > > back in or typing "resoft;source ~/.bashrc" > > > > > > > > for reference, here is my ~/.soft file > > > > > > > > @python-2.5 > > > > +java-sun > > > > +osg-client > > > > +maui > > > > +torque > > > > +R > > > > +matlab-7.7 > > > > +apache-ant > > > > +gx-map > > > > @osg > > > > @default > > > > @globus-4 > > > > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > > wrote: > > > > > > > > > I'm having trouble with the > > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > > --engage-verified > > > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > > /etc/condor/, nor ~condor/ contain a condor_config > > source. > > > > > Either set CONDOR_CONFIG to point to a valid config > > source, > > > > > or put a "condor_config" file in /etc/condor or > > ~condor/ > > > > > Exiting. > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > > > Any help is much appreciated. > > > > > > > > > > Thanks, > > > > > Erin > > > > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > > Associate Professor > > > > > Department of Computer and Mathematical Sciences > > > > > University of Houston - Downtown > > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of > > Michael Wilde > > > > > Sent: Fri 6/26/2009 2:14 PM > > > > > To: Melinda Chin > > > > > Cc: Swift User Discussion List > > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > > > From a practical point of view, you can get the sites > > in the OSG > > > > > "engagement" VO using this Swift command, which returns > > the list as a > > > > > sites.xml file: > > > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > > > That command also has options for getting sites from > > other VOs > > > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > > Here's what I think is a simple question but can't > > find the answer to? > > > > > > Where can I find a list of all the osg sites? 
> > > > > >
> > > > > > Thank you,
> > > > > > Melinda Chin

From hategan at mcs.anl.gov Fri Jun 26 16:39:14 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 26 Jun 2009 16:39:14 -0500
Subject: [swift-user] List of OSG sites
In-Reply-To:
References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> <1246052071.14751.8.camel@localhost>
Message-ID: <1246052354.15172.0.camel@localhost>

Maybe it isn't necessary, but it is sufficient for me.
On Fri, 2009-06-26 at 16:36 -0500, Glen Hocky wrote: > that would be odd if it were necessary, since the only related thing i > use is > source /opt/osg/setup.sh > which only works on tp-osg > > On Fri, Jun 26, 2009 at 4:34 PM, Mihael Hategan > wrote: > On Fri, 2009-06-26 at 16:22 -0500, Glen Hocky wrote: > > By the way, in reference to my previous e-mail. the result > of my > > condor status is: > > > Yep, this means it's working, but there is no condor pool on > tp. > > Erin, try this: > source /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh > > Then swift-osg... > > > > > > condor_status > > CEDAR:6001:Failed to connect to <128.135.125.117:9618> > > Error: Couldn't contact the condor_collector on > > tp-login2.ci.uchicago.edu. > > > > Extra Info: the condor_collector is a process that runs on > the > > central > > manager of your Condor pool and collects the status of all > the > > machines and > > jobs in the Condor pool. The condor_collector might not be > running, it > > might > > be refusing to communicate with you, there might be a > network problem, > > or > > there may be some other problem. Check with your system > administrator > > to fix > > this problem. > > > > If you are the system administrator, check that the > condor_collector > > is > > running on tp-login2.ci.uchicago.edu, check the HOSTALLOW > > configuration in > > your condor_config, and check the MasterLog and CollectorLog > files in > > your > > log directory for possible clues as to why the > condor_collector is > > not > > responding. Also see the Troubleshooting section of the > manual. > > > > which is different from yours. try putting @osg and > +osg-client before > > @default > > > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > > > wrote: > > Doesn't work. 
> > > > My .soft: > > @default > > @globus-4 > > +java-sun > > +torque > > +maui > > +condor > > @osg > > +osg-client > > > > [hategan at tp-login2 ~]$ condor_status > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config > source. > > Either set CONDOR_CONFIG to point to a valid config > source, > > or put a "condor_config" file in /etc/condor or > ~condor/ > > Exiting. > > > > > > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde > wrote: > > > If you added your swift/bin directory to your path > manually, > > when you > > > run resoft you may loose that and need to do it > again. > > > > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > > This is getting weird: > > > > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > > --engage-verified > > > > -bash: swift-osg-ress-site-catalog: command not > found > > > > [erin at tp-login2 ~]$ > > > > > > > > Erin M. Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > > Sent: Fri 6/26/2009 3:24 PM > > > > To: Hodgess, Erin > > > > Cc: Michael Wilde; Melinda Chin; Swift User > Discussion > > List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > It's likely this error can be fixed by the same > thing Zhao > > did to get condor > > > > working last week (since the command works for > me). 
Try > > putting @osg and > > > > +osg-client into your ~/.soft file and then > resetting up > > by either logging > > > > back in or typing "resoft;source ~/.bashrc" > > > > > > > > for reference, here is my ~/.soft file > > > > > > > > @python-2.5 > > > > +java-sun > > > > +osg-client > > > > +maui > > > > +torque > > > > +R > > > > +matlab-7.7 > > > > +apache-ant > > > > +gx-map > > > > @osg > > > > @default > > > > @globus-4 > > > > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > > wrote: > > > > > > > > > I'm having trouble with the > > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > swift-osg-ress-site-catalog > > --engage-verified > > > > > > > > > > Neither the environment variable > CONDOR_CONFIG, > > > > > /etc/condor/, nor ~condor/ contain a > condor_config > > source. > > > > > Either set CONDOR_CONFIG to point to a valid > config > > source, > > > > > or put a "condor_config" file in /etc/condor > or > > ~condor/ > > > > > Exiting. > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > > > Any help is much appreciated. > > > > > > > > > > Thanks, > > > > > Erin > > > > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > > Associate Professor > > > > > Department of Computer and Mathematical > Sciences > > > > > University of Houston - Downtown > > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: swift-user-bounces at ci.uchicago.edu on > behalf of > > Michael Wilde > > > > > Sent: Fri 6/26/2009 2:14 PM > > > > > To: Melinda Chin > > > > > Cc: Swift User Discussion List > > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > > > From a practical point of view, you can get > the sites > > in the OSG > > > > > "engagement" VO using this Swift command, > which returns > > the list as a > > > > > sites.xml file: > > > > > > > > > > swift-osg-ress-site-catalog > --engage-verified > > > > > > > > > > That command also has options for getting > sites from > > other VOs > > > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > > Here's what I think is a simple question > but can't > > find the answer to? > > > > > > Where can I find a list of all the osg > sites? 
> > > > > >
> > > > > > Thank you,
> > > > > > Melinda Chin

From HodgessE at uhd.edu Fri Jun 26 16:41:19 2009
From: HodgessE at uhd.edu (Hodgess, Erin)
Date: Fri, 26 Jun 2009 16:41:19 -0500
Subject: [swift-user] List of OSG sites
References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com><4A451E1C.4020406@mcs.anl.gov><70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus><70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus><4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> <1246052071.14751.8.camel@localhost>
Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C370D4@BALI.uhd.campus>

That's it! thanks very much!
e

Erin M.
Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: Mihael Hategan [mailto:hategan at mcs.anl.gov] Sent: Fri 6/26/2009 4:34 PM To: Glen Hocky Cc: Michael Wilde; Swift User Discussion List; Hodgess, Erin Subject: Re: [swift-user] List of OSG sites On Fri, 2009-06-26 at 16:22 -0500, Glen Hocky wrote: > By the way, in reference to my previous e-mail. the result of my > condor status is: Yep, this means it's working, but there is no condor pool on tp. Erin, try this: source /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh Then swift-osg... > > condor_status > CEDAR:6001:Failed to connect to <128.135.125.117:9618> > Error: Couldn't contact the condor_collector on > tp-login2.ci.uchicago.edu. > > Extra Info: the condor_collector is a process that runs on the > central > manager of your Condor pool and collects the status of all the > machines and > jobs in the Condor pool. The condor_collector might not be running, it > might > be refusing to communicate with you, there might be a network problem, > or > there may be some other problem. Check with your system administrator > to fix > this problem. > > If you are the system administrator, check that the condor_collector > is > running on tp-login2.ci.uchicago.edu, check the HOSTALLOW > configuration in > your condor_config, and check the MasterLog and CollectorLog files in > your > log directory for possible clues as to why the condor_collector is > not > responding. Also see the Troubleshooting section of the manual. > > which is different from yours. try putting @osg and +osg-client before > @default > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > wrote: > Doesn't work. 
> > My .soft: > @default > @globus-4 > +java-sun > +torque > +maui > +condor > @osg > +osg-client > > [hategan at tp-login2 ~]$ condor_status > > Neither the environment variable CONDOR_CONFIG, > /etc/condor/, nor ~condor/ contain a condor_config source. > Either set CONDOR_CONFIG to point to a valid config source, > or put a "condor_config" file in /etc/condor or ~condor/ > Exiting. > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde wrote: > > If you added your swift/bin directory to your path manually, > when you > > run resoft you may loose that and need to do it again. > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > This is getting weird: > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > --engage-verified > > > -bash: swift-osg-ress-site-catalog: command not found > > > [erin at tp-login2 ~]$ > > > > > > Erin M. Hodgess, PhD > > > Associate Professor > > > Department of Computer and Mathematical Sciences > > > University of Houston - Downtown > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > -----Original Message----- > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > Sent: Fri 6/26/2009 3:24 PM > > > To: Hodgess, Erin > > > Cc: Michael Wilde; Melinda Chin; Swift User Discussion > List > > > Subject: Re: [swift-user] List of OSG sites > > > > > > It's likely this error can be fixed by the same thing Zhao > did to get condor > > > working last week (since the command works for me). 
Try > putting @osg and > > > +osg-client into your ~/.soft file and then resetting up > by either logging > > > back in or typing "resoft;source ~/.bashrc" > > > > > > for reference, here is my ~/.soft file > > > > > > @python-2.5 > > > +java-sun > > > +osg-client > > > +maui > > > +torque > > > +R > > > +matlab-7.7 > > > +apache-ant > > > +gx-map > > > @osg > > > @default > > > @globus-4 > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > wrote: > > > > > > > I'm having trouble with the > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ swift-osg-ress-site-catalog > --engage-verified > > > > > > > > Neither the environment variable CONDOR_CONFIG, > > > > /etc/condor/, nor ~condor/ contain a condor_config > source. > > > > Either set CONDOR_CONFIG to point to a valid config > source, > > > > or put a "condor_config" file in /etc/condor or > ~condor/ > > > > Exiting. > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > Any help is much appreciated. > > > > > > > > Thanks, > > > > Erin > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: swift-user-bounces at ci.uchicago.edu on behalf of > Michael Wilde > > > > Sent: Fri 6/26/2009 2:14 PM > > > > To: Melinda Chin > > > > Cc: Swift User Discussion List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > From a practical point of view, you can get the sites > in the OSG > > > > "engagement" VO using this Swift command, which returns > the list as a > > > > sites.xml file: > > > > > > > > swift-osg-ress-site-catalog --engage-verified > > > > > > > > That command also has options for getting sites from > other VOs > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > Here's what I think is a simple question but can't > find the answer to? > > > > > Where can I find a list of all the osg sites? 
> > > > >
> > > > > Thank you,
> > > > > Melinda Chin

From hategan at mcs.anl.gov Fri Jun 26 16:43:09 2009
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 26 Jun 2009 16:43:09 -0500
Subject: [swift-user] List of OSG sites
In-Reply-To:
References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> <1246052071.14751.8.camel@localhost>
Message-ID: <1246052589.15258.1.camel@localhost>

On Fri, 2009-06-26 at 16:36 -0500, Glen Hocky wrote:
> that would be odd if it were necessary, since the only related thing i
> use is
> source /opt/osg/setup.sh

You mean /soft/osg/setup.sh?
[hategan at tp-login2 ~]$ ls -al /soft/osg lrwxrwxrwx 1 grog software 10 Mar 1 21:47 /soft/osg -> osg-client [hategan at tp-login2 ~]$ ls -al /soft/osg-client lrwxrwxrwx 1 grog software 20 Mar 1 21:47 /soft/osg-client -> osg-client-1.0.0-r1/ [hategan at tp-login2 ~]$ ls -al /soft/osg-client-1.0.0-r1 lrwxrwxrwx 1 grog software 49 Mar 1 21:47 /soft/osg-client-1.0.0-r1 -> /software/linux-rhel4-x86_64/osg-client-1.0.0-r1/ > which only works on tp-osg > > On Fri, Jun 26, 2009 at 4:34 PM, Mihael Hategan > wrote: > On Fri, 2009-06-26 at 16:22 -0500, Glen Hocky wrote: > > By the way, in reference to my previous e-mail. the result > of my > > condor status is: > > > Yep, this means it's working, but there is no condor pool on > tp. > > Erin, try this: > source /autonfs/software/linux-rhel4-x86_64/osg-client-1.0.0-r1/setup.sh > > Then swift-osg... > > > > > > condor_status > > CEDAR:6001:Failed to connect to <128.135.125.117:9618> > > Error: Couldn't contact the condor_collector on > > tp-login2.ci.uchicago.edu. > > > > Extra Info: the condor_collector is a process that runs on > the > > central > > manager of your Condor pool and collects the status of all > the > > machines and > > jobs in the Condor pool. The condor_collector might not be > running, it > > might > > be refusing to communicate with you, there might be a > network problem, > > or > > there may be some other problem. Check with your system > administrator > > to fix > > this problem. > > > > If you are the system administrator, check that the > condor_collector > > is > > running on tp-login2.ci.uchicago.edu, check the HOSTALLOW > > configuration in > > your condor_config, and check the MasterLog and CollectorLog > files in > > your > > log directory for possible clues as to why the > condor_collector is > > not > > responding. Also see the Troubleshooting section of the > manual. > > > > which is different from yours. 
try putting @osg and > +osg-client before > > @default > > > > On Fri, Jun 26, 2009 at 3:59 PM, Mihael Hategan > > > wrote: > > Doesn't work. > > > > My .soft: > > @default > > @globus-4 > > +java-sun > > +torque > > +maui > > +condor > > @osg > > +osg-client > > > > [hategan at tp-login2 ~]$ condor_status > > > > Neither the environment variable CONDOR_CONFIG, > > /etc/condor/, nor ~condor/ contain a condor_config > source. > > Either set CONDOR_CONFIG to point to a valid config > source, > > or put a "condor_config" file in /etc/condor or > ~condor/ > > Exiting. > > > > > > > > > > On Fri, 2009-06-26 at 15:41 -0500, Michael Wilde > wrote: > > > If you added your swift/bin directory to your path > manually, > > when you > > > run resoft you may loose that and need to do it > again. > > > > > > > > > On 6/26/09 3:33 PM, Hodgess, Erin wrote: > > > > This is getting weird: > > > > > > > > > > > > [erin at tp-login2 ~]$ swift-osg-ress-site-catalog > > --engage-verified > > > > -bash: swift-osg-ress-site-catalog: command not > found > > > > [erin at tp-login2 ~]$ > > > > > > > > Erin M. Hodgess, PhD > > > > Associate Professor > > > > Department of Computer and Mathematical Sciences > > > > University of Houston - Downtown > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: hockyg at gmail.com on behalf of Glen Hocky > > > > Sent: Fri 6/26/2009 3:24 PM > > > > To: Hodgess, Erin > > > > Cc: Michael Wilde; Melinda Chin; Swift User > Discussion > > List > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > It's likely this error can be fixed by the same > thing Zhao > > did to get condor > > > > working last week (since the command works for > me). 
Try > > putting @osg and > > > > +osg-client into your ~/.soft file and then > resetting up > > by either logging > > > > back in or typing "resoft;source ~/.bashrc" > > > > > > > > for reference, here is my ~/.soft file > > > > > > > > @python-2.5 > > > > +java-sun > > > > +osg-client > > > > +maui > > > > +torque > > > > +R > > > > +matlab-7.7 > > > > +apache-ant > > > > +gx-map > > > > @osg > > > > @default > > > > @globus-4 > > > > > > > > > > > > On Fri, Jun 26, 2009 at 3:20 PM, Hodgess, Erin > > wrote: > > > > > > > > > I'm having trouble with the > > swift-osg-ress-site-catalog command: > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > swift-osg-ress-site-catalog > > --engage-verified > > > > > > > > > > Neither the environment variable > CONDOR_CONFIG, > > > > > /etc/condor/, nor ~condor/ contain a > condor_config > > source. > > > > > Either set CONDOR_CONFIG to point to a valid > config > > source, > > > > > or put a "condor_config" file in /etc/condor > or > > ~condor/ > > > > > Exiting. > > > > > > > > > > > > > > > > > > > > > > > > > [erin at tp-login2 swift]$ > > > > > > > > > > Any help is much appreciated. > > > > > > > > > > Thanks, > > > > > Erin > > > > > > > > > > > > > > > Erin M. 
Hodgess, PhD > > > > > Associate Professor > > > > > Department of Computer and Mathematical > Sciences > > > > > University of Houston - Downtown > > > > > mailto: hodgesse at uhd.edu > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > From: swift-user-bounces at ci.uchicago.edu on > behalf of > > Michael Wilde > > > > > Sent: Fri 6/26/2009 2:14 PM > > > > > To: Melinda Chin > > > > > Cc: Swift User Discussion List > > > > > Subject: Re: [swift-user] List of OSG sites > > > > > > > > > > From a practical point of view, you can get > the sites > > in the OSG > > > > > "engagement" VO using this Swift command, > which returns > > the list as a > > > > > sites.xml file: > > > > > > > > > > swift-osg-ress-site-catalog > --engage-verified > > > > > > > > > > That command also has options for getting > sites from > > other VOs > > > > > > > > > > swift-osg-ress-site-catalog --vo=osg > > > > > > > > > > On 6/26/09 2:04 PM, Melinda Chin wrote: > > > > > > Here's what I think is a simple question > but can't > > find the answer to? > > > > > > Where can I find a list of all the osg > sites? 
> > > > > > > > > > > > Thank you, > > > > > > Melinda Chin > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > > _______________________________________________ > > > > > > Swift-user mailing list > > > > > > Swift-user at ci.uchicago.edu > > > > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > _______________________________________________ > > > > > Swift-user mailing list > > > > > Swift-user at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-user mailing list > > > > > Swift-user at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > > > > > > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > > > > From benc at hawaga.org.uk Sat Jun 27 03:22:59 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Sat, 27 Jun 2009 08:22:59 +0000 (GMT) Subject: [Swift-user] How best to get per-job times? In-Reply-To: <4A451933.9050704@mcs.anl.gov> References: <4A451933.9050704@mcs.anl.gov> Message-ID: On Fri, 26 Jun 2009, Michael Wilde wrote: > But looking at the _swiftwrap script, I dont see where it records the job's > resource consumption (at least CPU and wall time, and ideally more, like > memory). > > Is that available somewhere from the wrapper, or only from kickstart? It records walltime, by logging the start and end of the app executable. For more fancy stuff, use kickstart. 
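To illustrate the "log the start and end" idea, here is a minimal shell sketch. It is not the actual _swiftwrap code; timed_run is a hypothetical helper name, and it measures wall time only, at one-second resolution.

```shell
# timed_run: hypothetical helper, not part of Swift or _swiftwrap.
# Records a timestamp before and after the command and reports the difference.
timed_run() {
    start=$(date +%s)            # seconds since the epoch, before the app runs
    "$@"                         # run the app executable with its arguments
    status=$?
    end=$(date +%s)              # seconds since the epoch, after the app exits
    echo "walltime_seconds=$((end - start)) exit=$status" >&2
    return $status
}

timed_run sleep 1
```

CPU time and memory use are not visible this way, which is why kickstart is suggested for those.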
-- From benc at hawaga.org.uk Sat Jun 27 03:26:26 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Sat, 27 Jun 2009 08:26:26 +0000 (GMT) Subject: [swift-user] List of OSG sites In-Reply-To: <1246051619.14751.4.camel@localhost> References: <63cc32bc0906261204l52f9a794vae5b164d52840e19@mail.gmail.com> <4A451E1C.4020406@mcs.anl.gov> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D0@BALI.uhd.campus> <70A5AC06FDB5E54482D19E1C04CDFCF307C370D1@BALI.uhd.campus> <4A453261.9090300@mcs.anl.gov> <1246049942.14074.2.camel@localhost> <1246051619.14751.4.camel@localhost> Message-ID: For people on this thread playing with their own Swift installs, there is a CI softenv key, @swift, that gives you Swift so that you don't need your own install. That will eliminate some of the fiddling with paths mentioned in this thread. -- From me.melly at gmail.com Mon Jun 29 10:36:28 2009 From: me.melly at gmail.com (Melinda Chin) Date: Mon, 29 Jun 2009 10:36:28 -0500 Subject: [Swift-user] TeraPort to OSG Message-ID: <63cc32bc0906290836ucc2ced3s91f37281e661cb0d@mail.gmail.com> I don't know if this question belongs here, but I think it's a relatively simple one to answer. I was trying to remotely access the Clemson site through TeraPort, but this error and others like it keep popping up: [mchin at tp-login2 ~]$ globus-job-run osgce.cs.clemson.edu /bin/pwd ERROR: Couldn't find a valid proxy. Use -debug for further information. ERROR: proxy does not exist Syntax : globus-job-run {[-:] [-np N] [...]}... Use -help to display full usage. [mchin at tp-login2 ~]$ I really hope it's just me typing the command wrong. I also thought it might be that my certificates aren't all in place. A couple of weeks ago I had an issue with my certificate being reissued, so it didn't match, but right now I can run globus-job-run just fine on the Clemson side.
Perhaps it also has something to do with the Swift tutorial, where it says this about remote access: *The site catalog is located in etc/sites.xml and is a relatively straightforward XML format file. We must modify each of the following three settings: gridftp (which indicates how and where data can be transferred to the remote resource), jobmanager (which indicates how applications can be run on the remote resource) and workdirectory (which indicates where working storage can be found on the remote resource).* However, I don't understand the exact steps I must take to edit gridftp, jobmanager, and workdirectory so that I can access the other side. Thanks, Melinda Chin From HodgessE at uhd.edu Mon Jun 29 10:48:41 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Mon, 29 Jun 2009 10:48:41 -0500 Subject: [Swift-user] TeraPort to OSG References: <63cc32bc0906290836ucc2ced3s91f37281e661cb0d@mail.gmail.com> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C370E9@BALI.uhd.campus> Do grid-proxy-init first. (The same thing happened to me this morning!) Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: swift-user-bounces at ci.uchicago.edu on behalf of Melinda Chin Sent: Mon 6/29/2009 10:36 AM To: Swift User Discussion List Subject: [Swift-user] TeraPort to OSG I don't know if this question belongs here, but I think it's a relatively simple one to answer. I was trying to remotely access the Clemson site through TeraPort, but this error and others like it keep popping up: [mchin at tp-login2 ~]$ globus-job-run osgce.cs.clemson.edu /bin/pwd ERROR: Couldn't find a valid proxy. Use -debug for further information. ERROR: proxy does not exist Syntax : globus-job-run {[-:] [-np N] [...]}... Use -help to display full usage.
[mchin at tp-login2 ~]$ I really hope it's just me typing the command wrong. I also thought it might be that my certificates aren't all in place. A couple of weeks ago I had an issue with my certificate being reissued, so it didn't match, but right now I can run globus-job-run just fine on the Clemson side. Perhaps it also has something to do with the Swift tutorial, where it says this about remote access: *The site catalog is located in etc/sites.xml and is a relatively straightforward XML format file. We must modify each of the following three settings: gridftp (which indicates how and where data can be transferred to the remote resource), jobmanager (which indicates how applications can be run on the remote resource) and workdirectory (which indicates where working storage can be found on the remote resource).* However, I don't understand the exact steps I must take to edit gridftp, jobmanager, and workdirectory so that I can access the other side. Thanks, Melinda Chin From wilde at mcs.anl.gov Mon Jun 29 11:15:07 2009 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 29 Jun 2009 11:15:07 -0500 Subject: [Swift-user] Which OSG directories to use? In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C370E8@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C370E8@BALI.uhd.campus> Message-ID: <4A48E88B.8070106@mcs.anl.gov> was: Re: first question o'the day On 6/29/09 10:19 AM, Hodgess, Erin wrote: > Hi all! > > Here is my first question (of many) for today. > > I have a copy of the sites.xml file, which contains the names of the > sites and also working directories. > > Please consider the following: > > When I check /bin/pwd on osgce.cs.clemson.edu, I get > [erin at tp-login2 oops]$ globus-job-run osgce.cs.clemson.edu /bin/pwd > /home/osg > > So the directory is /home/osg That's likely the $HOME directory of the login for your VO (Engage?)
on that site. > > But when you look at the sites.xml file, it has: > > /export/osg/data/engage/tmp/Clemson-ciTeam > > as the working directory. I suspect that's either the "$DATA" or "$TMP" directory for that site. These should be described in some OSG document - maybe googling for these will help. Maybe the Engage VO has some notes about site usage. > My question is: which directory should I use, please? I suggest using sites.xml as it's generated, but report any problems you find. - Mike > > Thanks, > Erin > > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > From rynge at renci.org Mon Jun 29 11:41:49 2009 From: rynge at renci.org (Mats Rynge) Date: Mon, 29 Jun 2009 12:41:49 -0400 Subject: [Swift-user] Which OSG directories to use? In-Reply-To: <4A48E88B.8070106@mcs.anl.gov> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C370E8@BALI.uhd.campus> <4A48E88B.8070106@mcs.anl.gov> Message-ID: <4A48EECD.7050101@renci.org> Michael Wilde wrote: >> But when you look at the sites.xml file, it has: >> >> /export/osg/data/engage/tmp/Clemson-ciTeam >> >> as the working directory. > > I suspect that's either the "$DATA" or "$TMP" directory for that site. Yes, that is $OSG_DATA/$VO_NAME/tmp/. $OSG_DATA is a shared directory across the resource and is the correct place to put temporary data during a run. Make sure you clean up when you are done. Make sure you use --vo=[VO] for the VO you are a member of. Do not use $HOME for anything on OSG.
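As a rough illustration of how that path is put together, here is a small shell sketch. It only does string assembly; osg_workdir is a hypothetical helper, and the values passed to it are the ones from the Clemson entry quoted above, where the trailing Clemson-ciTeam is assumed to be a per-site label appended by the catalog generator.

```shell
# osg_workdir: hypothetical helper that assembles $OSG_DATA/$VO_NAME/tmp[/label].
#   $1 = value of $OSG_DATA on the site
#   $2 = VO name
#   $3 = optional site label
osg_workdir() {
    printf '%s/%s/tmp%s\n' "$1" "$2" "${3:+/$3}"
}

osg_workdir /export/osg/data engage Clemson-ciTeam
# prints /export/osg/data/engage/tmp/Clemson-ciTeam
```

In practice there should be no need to build this by hand, since the generated sites.xml already carries the work directory for each site.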
-- Mats Rynge Renaissance Computing Institute From benc at hawaga.org.uk Mon Jun 29 16:23:34 2009 From: benc at hawaga.org.uk (Ben Clifford) Date: Mon, 29 Jun 2009 21:23:34 +0000 (GMT) Subject: [Swift-user] TeraPort to OSG In-Reply-To: <63cc32bc0906290836ucc2ced3s91f37281e661cb0d@mail.gmail.com> References: <63cc32bc0906290836ucc2ced3s91f37281e661cb0d@mail.gmail.com> Message-ID: On Mon, 29 Jun 2009, Melinda Chin wrote: > [mchin at tp-login2 ~]$ globus-job-run osgce.cs.clemson.edu /bin/pwd > > > ERROR: Couldn't find a valid proxy. > Use -debug for further information. > > ERROR: proxy does not exist You need a proxy. Run grid-proxy-init and make sure grid-proxy-info gives valid output. To make that work, you may need to fiddle round with your credentials a bit. -- From HodgessE at uhd.edu Tue Jun 30 11:58:51 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 30 Jun 2009 11:58:51 -0500 Subject: [Swift-user] Swift on OSG sites Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> Hi Swift Users: I'm trying to run OOPS via Swift on an OSG site, but it can't find Swift: [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh plist input 4 /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh: line 10: swift: command not found Now when I put the path for Swift in a simple example, I get: [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu /home/erin/cog/modules/swift/dist/swift-svn/bin/swift /home/osg/hello.swift GRAM Job failed because the executable does not exist (error code 5) Has anyone seen this before, please? Thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From HodgessE at uhd.edu Tue Jun 30 12:35:55 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 30 Jun 2009 12:35:55 -0500 Subject: [Swift-user] (no subject) Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37122@BALI.uhd.campus> Hi yet again: Thought I'd try with Swift via condor, but this is what I got: [erin at tp-login2 swift1]$ condor_submit test.submit Neither the environment variable CONDOR_CONFIG, /etc/condor/, nor ~condor/ contain a condor_config source. Either set CONDOR_CONFIG to point to a valid config source, or put a "condor_config" file in /etc/condor or ~condor/ Exiting. Any idea what is wrong, please? thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From HodgessE at uhd.edu Tue Jun 30 12:43:30 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 30 Jun 2009 12:43:30 -0500 Subject: [Swift-user] Swift on OSG sites References: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> <50b07b4b0906301040r34ff9c39y53a93aa138e102f0@mail.gmail.com> Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37124@BALI.uhd.campus> We're trying to run OOPS, and the ultimate goal is to run OOPS via Swift. Our next step is to try some OOPS commands at the OSG sites. thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu -----Original Message----- From: yecartes at gmail.com on behalf of Allan Espinosa Sent: Tue 6/30/2009 12:40 PM To: Hodgess, Erin Cc: swift-user at ci.uchicago.edu Subject: Re: [Swift-user] Swift on OSG sites hello erin. i suggest you write everything in absolute paths or add manually a PATH env variable in your scripts. also, do you really need the swift submit program running on the osg site itself? 
-Allan 2009/6/30 Hodgess, Erin : > Hi Swift Users: > > I'm trying to run OOPS via Swift on an OSG site, but it can't find Swift: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh plist > input 4 > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh: line 10: > swift: command not found > > > Now when I put the path for Swift in a simple example, I get: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/erin/cog/modules/swift/dist/swift-svn/bin/swift /home/osg/hello.swift > GRAM Job failed because the executable does not exist (error code 5) > > Has anyone seen this before, please? > > Thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > -- Allan M. Espinosa PhD student, Computer Science University of Chicago -------------- next part -------------- An HTML attachment was scrubbed... URL: From aespinosa at cs.uchicago.edu Tue Jun 30 12:40:04 2009 From: aespinosa at cs.uchicago.edu (Allan Espinosa) Date: Tue, 30 Jun 2009 12:40:04 -0500 Subject: [Swift-user] Swift on OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> Message-ID: <50b07b4b0906301040r34ff9c39y53a93aa138e102f0@mail.gmail.com> hello erin. i suggest you write everything in absolute paths or add manually a PATH env variable in your scripts. also, do you really need the swift submit program running on the osg site itself? 
-Allan 2009/6/30 Hodgess, Erin : > Hi Swift Users: > > I'm trying to run OOPS via Swift on an OSG site, but it can't find Swift: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh plist > input 4 > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh: line 10: > swift: command not found > > > Now when I put the path for Swift in a simple example, I get: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/erin/cog/modules/swift/dist/swift-svn/bin/swift /home/osg/hello.swift > GRAM Job failed because the executable does not exist (error code 5) > > Has anyone seen this before, please? > > Thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > -- Allan M. Espinosa PhD student, Computer Science University of Chicago From HodgessE at uhd.edu Tue Jun 30 15:56:18 2009 From: HodgessE at uhd.edu (Hodgess, Erin) Date: Tue, 30 Jun 2009 15:56:18 -0500 Subject: [Swift-user] which job for which site Message-ID: <70A5AC06FDB5E54482D19E1C04CDFCF307C37130@BALI.uhd.campus> Hi all! I have multiple sites in my sites.xml file and am sending multiple jobs out. How do I determine which jobs went to which sites, please? I'm sure that it's something simple that I'm not seeing. thanks, Erin Erin M. Hodgess, PhD Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: hodgesse at uhd.edu
From aespinosa at cs.uchicago.edu Tue Jun 30 15:58:29 2009 From: aespinosa at cs.uchicago.edu (Allan Espinosa) Date: Tue, 30 Jun 2009 15:58:29 -0500 Subject: [Swift-user] which job for which site In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C37130@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C37130@BALI.uhd.campus> Message-ID: <50b07b4b0906301358l26dfe462y3fe3b12c0902cc3b@mail.gmail.com> Hi Erin. In the swift_filename-xxx.log file, check the vdl:execute2 JOB_START lines and look at the host= parameter. The name there corresponds to the pool handle= in your sites.xml file. 2009/6/30 Hodgess, Erin : > Hi all! > > I have multiple sites in my sites.xml file and am sending multiple jobs out. > > How do I determine which jobs went to which sites, please? > > I'm sure that it's something simple that I'm not seeing. > > thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > -- Allan M. Espinosa PhD student, Computer Science University of Chicago From mjwilde at gmail.com Tue Jun 30 12:47:23 2009 From: mjwilde at gmail.com (Michael Wilde) Date: Tue, 30 Jun 2009 12:47:23 -0500 Subject: [Swift-user] Swift on OSG sites In-Reply-To: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> References: <70A5AC06FDB5E54482D19E1C04CDFCF307C3711C@BALI.uhd.campus> Message-ID: <4A4A4FAB.8000500@gmail.com> Erin, - Swift is never run on grid sites; it's only run from a submit host. - runradial_norefine_teraport.sh is a client-side command, also meant to be run on a submit host, with swift in your PATH.
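A minimal sketch of what "with swift in your PATH" could look like on the submit host. The SWIFT_BIN directory is taken from the command in the original question and is only an assumption about where Swift is installed; require_cmd is a hypothetical helper, not a Swift or Globus command.

```shell
# Run this on the submit host (for example tp-login2), not through globus-job-run.

# Assumption: Swift lives in the directory used in the original question.
SWIFT_BIN=/home/erin/cog/modules/swift/dist/swift-svn/bin
PATH="$SWIFT_BIN:$PATH"
export PATH

# require_cmd: hypothetical helper that succeeds only if the command is in PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1: not found in PATH" >&2
    return 1
}

if require_cmd swift; then
    echo "swift found at $(command -v swift)"
else
    echo "fix PATH before running runradial_norefine_teraport.sh" >&2
fi
```

Once the check passes, the script can be started directly from the submit host's shell, and Swift takes care of sending the individual jobs to the sites listed in sites.xml.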
Mike On 6/30/09 11:58 AM, Hodgess, Erin wrote: > > Hi Swift Users: > > I'm trying to run OOPS via Swift on an OSG site, but it can't find Swift: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh > plist input 4 > /home/osg/swoops/users/hockyg/swift/runradial_norefine_teraport.sh: > line 10: swift: command not found > > > Now when I put the path for Swift in a simple example, I get: > > [erin at tp-login2 oop1]$ globus-job-run osgce.cs.clemson.edu > /home/erin/cog/modules/swift/dist/swift-svn/bin/swift > /home/osg/hello.swift > GRAM Job failed because the executable does not exist (error code 5) > > Has anyone seen this before, please? > > Thanks, > Erin > > > Erin M. Hodgess, PhD > Associate Professor > Department of Computer and Mathematical Sciences > University of Houston - Downtown > mailto: hodgesse at uhd.edu > > ------------------------------------------------------------------------ > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user >